Automating startup evaluation: from qualitative descriptions to a multidimensional, text-based measure of maturity and investment readiness

Mariannelli, Veronica

In recent decades, startups have come to play a central role in the analysis of economic, technological and social transformations. They are not only one of the main engines of innovation, capable of profoundly changing market dynamics and increasing productivity, but also contribute to reshaping markets, industries, and innovation trajectories (Zacharakis and Meyer, 1998). Assessing the maturity of startups, especially in their early stages, is a problem the academic community has long confronted, yet it remains open: the literature addresses it only partially, through heterogeneous approaches that are difficult to replicate (Gao et al., 2021). There is therefore a strong need for analytical tools that combine technical rigor, operational capacity and scalability, in order to support investors' decision-making processes and entrepreneurs' strategic choices (Davila et al., 2003). Business ecosystems have become increasingly complex and require multidimensional approaches that go beyond traditional evaluation logics based only on financial indicators or non-formalized expert judgments (Franke et al., 2008). In particular, investment selection and screening are affected by information asymmetry between entrepreneurs and investors, by the cognitive biases that characterize decision-making, and by operational constraints on the time and resources available for the activity (Singhal et al., 2022; Busenitz and Barney, 1997). Traditional practices rest on a qualitative interpretation of narrative documents whose unstructured nature makes systematic and replicable analysis difficult. As a consequence, startups are evaluated through a process that is poorly standardized, difficult to scale and highly dependent on subjective judgment.
This problem is even more evident given the growing importance of dimensions such as sustainability, technological innovation and organizational capacity, which traditional evaluation tools can hardly capture (Aspelund et al., 2005; Colombo and Grilli, 2005). The lack of a shared language and objective metrics limits comparability between startups and reduces the effectiveness of financial resource allocation. The literature offers many models for the analysis of startups, ranging from frameworks based on organizational and strategic dimensions to tools focused on risk assessment and performance evaluation (Chandler and Hanks, 1993). Among these, the Investment Readiness Level (IRL), which classifies startups according to their degree of investment readiness, is particularly relevant. However, these approaches have significant limitations. They are often weakly operationalized and offer poor analytical granularity; they rely mainly on self-declared data or subjective evaluations, which raises reliability concerns; and they are neither easily scalable nor well suited to contexts in which the available data are unstructured. In particular, natural-language descriptions of startups are an extremely important source of information that is still little used in systematic analysis (Barry, 1994; Dubini and Aldrich, 1991). Moreover, as explored in Chapter 1, these models rarely integrate in a systematic way fundamental dimensions such as technological innovation and sustainability, which today are key factors in the competitiveness of startups and their attractiveness to investors. This theoretical and practical gap is one of the central elements on which the contribution of this work builds.
At the same time, the growing availability of unstructured textual data (presentation documents, business plans and informative reports) is an opportunity that remains largely unexploited. These data carry strategically relevant information, but their narrative structure calls for tools capable of transforming them into observable and measurable quantities. Recent developments in natural language processing and AI open up possibilities for overcoming these limitations. The integration of textual analysis techniques with structured evaluation models is therefore a particularly promising research frontier, developed in Chapter 4, that can bridge the gap between the informational richness of qualitative data and the quantitative synthesis required by decision-making processes. This PhD thesis introduces and develops the SUS²A (Strategic Unbiased Startup Sustainability Awareness) framework, which is positioned in the gap between theoretical models for assessing startup maturity and automatic natural language analysis technologies (Baum and Silverman, 2004). Its fundamental contribution is the design of a structured system that transforms unstructured qualitative documents into replicable quantitative indicators, reducing interpretative ambiguity and making startups easier to compare with one another (Dhochak et al., 2024). In particular, the framework belongs to the line of studies on the Investment Readiness Level (IRL), examined in depth in Chapter 1, and enriches that theoretical framework through greater analytical granularity and an operational formalization based on textual evidence.
In this sense, SUS²A can be interpreted as an evolution of existing models, capable of integrating qualitative and quantitative dimensions into a single coherent system. SUS²A rests on a multidimensional conception of startup maturity organized into eleven fundamental categories, including team, strategy, finance, technology, validation, ESG and scalability, identified through a literature analysis. This theoretical structure is then translated into a granular system of 500 binary questions, designed to detect systematically the presence or absence of specific characteristics in the analyzed texts. This methodological choice distinguishes the work: by formalizing the evaluation criteria explicitly and verifiably, it overcomes the vagueness that characterizes qualitative approaches. The system of questions derives directly from the bibliometric analysis developed in Chapter 2, which makes it possible to identify the relevant dimensions rigorously and to translate them into a structured operational framework. The use of binary questions also reduces interpretative subjectivity, favoring the replicability of the results and making the model applicable on a large scale, even in contexts characterized by high information heterogeneity. The study is developed as a progressive process divided into three closely interrelated phases (Behrens et al., 2012). The first is a theoretical conceptualization phase, in which the key categories and questions for the analysis of startup maturity are identified and systematized through a literature analysis. The second is the operationalization phase, in which these dimensions are translated into a structured system of questions and a quantitative scoring model.
Finally, in the automation phase, the evaluation process is implemented with an automated textual analysis tool (Si et al., 2023). This last phase compares deterministic approaches based on keyword matching with semantic approaches based on large language models (LLMs), with the aim of evaluating their reliability, scalability and degree of agreement. This articulation reflects a logical progression that guides the reader from the definition of the problem to the proposal of an operational and technologically advanced solution, developed and validated in Chapter 3 and Chapter 4. The thesis is therefore developed as a unitary path divided into four main chapters. Chapter 1 introduces the SUS²A theoretical model and its connection with the concept of Investment Readiness Level (IRL), providing the conceptual framework of reference. Chapter 2 details the construction of the framework through a bibliometric analysis and the definition of the categories and the 500 questions (Dhochak and Doliya, 2020). Chapter 3 is dedicated to the definition of the scoring model and its empirical application, showing the results obtained from the analysis of startups. Finally, Chapter 4 develops an agent-based model that simulates the interaction between startups and investors under different conditions and market regimes; it makes available values related to the intrinsic properties of startups, as well as market values, that are easily comparable within the same market situation and across regimes. This structure accompanies the reader progressively through all phases of the research path, offering an integrated view that connects theory, methodology and empirical evidence and highlighting the incremental contribution of each chapter. Chapters 1 and 2 jointly contribute to the theoretical and methodological construction of the model, forming the basis of the first scientific contribution.
Chapter 3 corresponds to the empirical application and validation phase of the framework, while Chapter 4 develops a further contribution, focused on automation and the use of agent-based modeling techniques. This articulation makes it possible to maintain both narrative coherence and clarity in the presentation of scientific contributions. To further clarify the integrative logic connecting the four papers, it is useful to make the cumulative argument explicit. Chapter 1 establishes the theoretical necessity of SUS²A by demonstrating the insufficiency of existing IRL and startup maturity frameworks along three dimensions: granularity, domain completeness, and text-processing capability. Chapter 2 operationalizes this theoretical gap by showing — through a systematic bibliometric analysis of 1,297 articles — which dimensions of startup readiness the scientific literature considers most critical, and by translating these findings into 500 validated binary indicators through a Delphi expert process. Chapter 3 empirically tests the resulting instrument on 41 startups, comparing a deterministic keyword-matching approach against an LLM-based semantic evaluator to establish reliability and identify the framework’s diagnostic patterns. Chapter 4 extends the framework beyond static screening by embedding SUS²A as a quality measure within an agent-based simulation of startup-investor dynamics across different market regimes, demonstrating its utility as both an analytical and a modeling instrument. Each chapter therefore builds on the previous one: the theoretical gap identified in Chapter 1 motivates the bibliometric design of Chapter 2; the validated instrument from Chapter 2 enables the empirical application of Chapter 3; and the empirical findings from Chapter 3 provide the calibrated measure used in Chapter 4. This sequential dependency is not merely narrative but methodological: no chapter could stand alone without the foundation provided by the preceding one. 
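As an illustration of the deterministic side of this comparison, the following Python sketch shows how binary indicators might be answered by keyword matching and aggregated into per-category scores. It is a minimal sketch under stated assumptions: the four questions, their keywords, the plain substring matching and the equal-weight averaging are all hypothetical and do not reproduce the 500 indicators or the scoring model of the thesis.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class BinaryQuestion:
    """One yes/no indicator (hypothetical example, not from the thesis)."""
    category: str
    text: str
    keywords: Tuple[str, ...]


# Illustrative questions only; the real instrument has 500 validated indicators.
QUESTIONS: Tuple[BinaryQuestion, ...] = (
    BinaryQuestion("team", "Are the founders described?", ("founder",)),
    BinaryQuestion("team", "Is an advisory board mentioned?", ("advisory board", "advisor")),
    BinaryQuestion("esg", "Is environmental impact mentioned?", ("carbon", "emission", "environmental")),
    BinaryQuestion("finance", "Is revenue mentioned?", ("revenue", "turnover")),
)


def answer(question: BinaryQuestion, description: str) -> int:
    """Return 1 if any keyword occurs in the description (case-insensitive), else 0."""
    lowered = description.lower()
    return int(any(keyword in lowered for keyword in question.keywords))


def category_scores(
    description: str, questions: Sequence[BinaryQuestion] = QUESTIONS
) -> Dict[str, float]:
    """Share of questions answered 'yes' in each category (assumed equal weights)."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for question in questions:
        totals[question.category] = totals.get(question.category, 0) + 1
        hits[question.category] = hits.get(question.category, 0) + answer(question, description)
    return {category: hits[category] / totals[category] for category in totals}
```

Calling `category_scores` on a startup description returns a dictionary mapping each category to a value between 0 and 1. A semantic evaluator would replace only the `answer` function, leaving the aggregation unchanged, which is what makes the two approaches directly comparable question by question.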
The empirical results show that the model can systematically analyze a set of startups, identifying recurring patterns and structural gaps (Emir Hidayat et al., 2022). Specifically, gaps emerge in the ESG and talent management categories, while more traditional categories such as product and market receive greater attention. Furthermore, the comparison between deterministic and semantic approaches shows a high level of agreement, supporting the validity of the model as an effective and replicable preliminary screening tool (Bertoni et al., 2011). These results, discussed in particular in Chapter 3 and Chapter 4, reinforce the idea that a structured and automated approach can significantly improve the quality of evaluations, reducing the margin of error and increasing the transparency of decision-making processes. The originality of the work lies in combining a traditional idea based on textual analysis with an advanced computational methodology, thus contributing to the academic debate on several levels. The contributions of this study lie along a continuum that runs through the chapters: the first part reinforces the construct of maturity, the second gives it an operational structure, the third adds an empirical validation, and the fourth extends the application to a comparative agent-based simulation. This progression highlights not only the originality of the individual results but also the systemic value of the proposed framework as a whole. Unlike existing models, the framework analyzes the maturity of startups at a higher level of granularity, yielding a well-defined, detailed and quantitative representation of their characteristics.
This tool is not only a means of analysis but a genuine paradigm for how startups can be analyzed and evaluated, integrating scientific rigor, methodological innovation and the potential of emerging computational methods, with each fundamental step described across the four chapters. On a theoretical level, the work proposes a new formalization of the concept of startup maturity, going beyond static and one-dimensional views. On a methodological level, it introduces a replicable process of automated textual analysis, which makes it possible to overcome the subjectivity that characterizes qualitative evaluations. On a practical level, it offers a potentially useful tool for investors, incubators and policy makers who aim to evaluate a large number of ventures in a systematic and comparable way. The thesis is structured in four papers, each dedicated to a specific phase of the research path. The first paper focuses on the theoretical construction of the model, analyzing the literature and defining the categories and the set of questions. The second focuses on the construction of the scoring model and its application as a structured evaluation tool. The third analyzes the automation of the process and assesses reliability through a comparison between different textual analysis techniques. Finally, the fourth paper simulates the relationships between startups and investors in different market regimes through agent-based modeling. Together, these contributions outline a coherent path from theoretical conceptualization to the empirical validation of an automated evaluation system.
This articulation in papers therefore reflects a synthesis of the contents developed in the four chapters, while ensuring a clear identification of the scientific contributions and the overall coherence of the research path. In this sense, the papers are not isolated elements but integrated parts of a unitary design that is set out in full detail in the chapters. The study brings together theory, methodology and practice and lays the foundations for future developments in the automated analysis of startup maturity, in particular in the transformation of qualitative data into quantitative data. SUS²A is therefore an attempt to redefine the way startups are analyzed and evaluated, introducing a paradigm based on the transformation of natural-language descriptions into structured knowledge and supporting the evolution towards more objective, transparent and scalable decision-making processes. Given this structure, the research questions not only guide the development of the four chapters but also cut across them, ensuring consistency between theoretical objectives, methodological development and empirical validation. In light of the literature analysis and the objectives of this study, the research addresses a series of research questions, organized in line with the three phases that structure SUS²A. A first area of analysis concerns the theoretical and conceptual dimension of startup maturity: it examines how the concept of entrepreneurial maturity can be defined in a systematic and multidimensional way, overcoming simplified approaches. The first research question is: RQ1: How can the maturity of startups be contextualized and structured in a multidimensional framework based on existing literature? A second area of analysis concerns the operationalization of the theoretical construct.
The goal is to understand whether and how abstract dimensions can be transformed into observable and measurable indicators, reducing the interpretative errors typical of qualitative evaluations. The second research question is therefore: RQ2: How is it possible to operationalize the maturity of a startup through a structured and granular system of indicators based on textual data? Finally, the third and fourth areas of analysis concern the methodological and applied dimensions, and in particular the automation of the analysis process. In this context, the role of AI in the evaluation of startups is analyzed by comparing alternative approaches and their reliability. The research questions are as follows: RQ3: How can the startup evaluation process be automated through natural language text analysis techniques applied to unstructured textual data? RQ4: What is the level of reliability and agreement between deterministic approaches based on keyword matching and semantic approaches based on advanced language models in determining the maturity of a startup? These questions are the common thread of the entire research work, defining a path that goes from the theoretical definition of the model to its operational translation and subsequent empirical validation, with the aim of obtaining tools for the analysis of startups that are more structured, more replicable and less dependent on subjective judgment.
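To make the notion of agreement in RQ4 concrete, the sketch below computes Cohen's kappa between two evaluators that answer the same binary questions. The text above does not state which agreement statistic the thesis adopts, so the choice of kappa is an assumption made purely for illustration, and the two answer vectors are invented.

```python
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters giving 0/1 answers to the same questions."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must answer the same non-empty set of questions.")
    n = len(rater_a)

    # Observed agreement: share of questions with identical answers.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement, from each rater's marginal share of 'yes' answers.
    yes_a = sum(rater_a) / n
    yes_b = sum(rater_b) / n
    expected = yes_a * yes_b + (1 - yes_a) * (1 - yes_b)

    if expected == 1.0:
        # Both raters gave the same constant answer; kappa is undefined,
        # and this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented answers for eight questions (keyword matcher vs. semantic evaluator).
keyword_answers = [1, 1, 0, 0, 1, 0, 1, 1]
semantic_answers = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(keyword_answers, semantic_answers)  # 0.75 for these vectors
```

Unlike raw percentage agreement, kappa discounts the agreement that two evaluators would reach by chance, which matters when most questions receive the same answer (for example, when the majority of indicators are absent from a short description).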