
Modeling gender disparities in citation impact using co-authorship network metrics

Dilnaz Imanbayeva (1, 2)

Oleksandr Kuchanskyi (0, 1, 2), a.kuchanskyi@astanait.edu.kz

(0) Department of Biomedical Cybernetics, National Technical University of Ukraine 'Igor Sikorsky Kyiv Polytechnic Institute', Kyiv 03056, Ukraine
(1) School of Artificial Intelligence and Data Science, Astana IT University, Astana 010000, Kazakhstan
(2) WDA'26: International Workshop on Data Analytics, 2026

This study investigates the effect of author gender on scientific productivity and citation impact using large-scale network analysis of scholarly collaboration. The analysis is based on OpenAlex data covering 47,314 publications and 355,193 authors from 2021-2025. Co-authorship networks are constructed and analyzed using multiple centrality measures, including degree, betweenness, harmonic closeness, and eigenvector centrality. Regression models are applied at both the author and publication levels to control for network position and productivity effects. Author gender is inferred using Gender-API with a confidence threshold of 0.6, resulting in 206,000 classified authors. The results reveal a “Network Advantage Paradox”: despite slightly higher centrality values for female authors across most metrics, their citation counts are on average 5.5% lower than those of comparable male authors. At the publication level, papers produced by all-female teams receive 56.7% fewer citations than those by all-male teams, while mixed-gender teams achieve a 30.9% citation advantage. Furthermore, increasing gender diversity, measured by the Blau index, is associated with a substantial non-linear growth in citation impact, reaching up to a 13-fold difference between the lowest and highest diversity levels. These findings provide quantitative evidence of persistent gender bias in collaborative science and inform research evaluation and science policy.

Keywords: gender disparity, network analysis, scientific collaboration, citation impact, scientific productivity

1. Introduction

At present, the scientific community lacks a method that reliably assesses how an author’s gender influences their citation metrics and scientific productivity. One promising approach is to combine network analysis with traditional bibliometric methods that have long been used in research evaluation. This integrated perspective makes it possible to uncover more complex patterns and hidden structures of scientific collaboration and citation that remain invisible to conventional approaches. The development of network-based analytical methods can substantially improve the accuracy of evaluating gender disparities in academia, making this line of research both timely and significant.

Traditional bibliometric methods do not always take into account the relationships between authors during the production of a paper [ 1 ]. Network analysis, by contrast, explicitly incorporates co-authorship, ties between authors, citation networks, and the influence of researchers within the scientific community, thereby increasing the objectivity of assessing gender-related patterns in research [ 2 ].

Over the past few decades, network-based methods have been increasingly applied across various fields, including the social sciences, linguistics, and scientometrics [ 2, 3, 4 ]. Recent studies have also focused on integrating network analysis into the assessment of scientific productivity among male and female authors [ 5 ]. However, in most cases, existing approaches rely primarily on conventional statistical techniques, which limits their ability to capture and analyse the internal structure and interconnectedness of academic networks.

This study seeks to create a way of measuring how author gender affects citations and research output through network analysis. By applying this method, current tools used in gender gap research could become more effective, offering clearer insights into gender trends within scientific fields. Moreover, techniques based on networks might reveal subtle influence structures that standard bibliometric models often miss. Building and applying this framework would allow fairer, broader evaluations of gender imbalances in scholarly work while supporting progress toward balanced assessment systems.

The way scientists work together shapes how much they publish and how often their work gets cited [ 6 ]. Yet women in research still face barriers that slow down careers, reduce visibility, or limit chances to advance [ 7 ]. Thanks to richer publication records and better tools for mapping collaborations, we now need smarter approaches to study how gender affects both citations and output [ 6 ].

Although standard citation methods are useful [ 8 ], they usually miss how deeply connected researchers really are. By using network models, we can see unseen links and power layers among science groups. Findings suggest teams with both men and women gain more attention through citations compared to those made only by women; even so, publications led by men still appear most in top-ranked journals [ 9 ]. These results highlight why fresh techniques are needed to examine unequal recognition based on gender across scholarly work. Some similar suggestions support using blockchain to track not only direct references but also how widely an idea spreads through linked papers - providing clearer, more verifiable data than standard measures [10].

In Kazakhstan, although research production is rising, structural issues remain; thus, using a network-based method to study gender gaps in citations may offer useful findings. Past work shows that men frequently occupy key positions in scholarly networks, which tends to boost their citation numbers [11]. In contrast, women usually form tighter collaborative circles, potentially supporting lasting research growth even if short-term citations lag behind [ 5 ].

The originality of this work comes from combining gender studies with network analysis, enabling a closer look at systemic imbalances in scholarly output. Instead of standard techniques, it focuses on connections and positions inside research networks, offering broader insight into unequal outcomes by gender. By uncovering specific structural traits that affect how often papers are cited, the study supports creating measures designed to improve fairness between genders in higher education.

Other studies also report lower citation rates for female research teams. For example, in high-impact medical journals, articles in which both the first and the last authors are women receive approximately half as many citations as works authored by men in both authorship positions [12, 13]. Overall, research highlights an ongoing 'productivity puzzle': despite high female enrollment in education, women hold fewer top academic roles and tend to publish less than male peers, though the difference varies by discipline [14].

Besides adding to science, this study matters in real-world settings [15]. Because it reveals how gender shapes teamwork in research, it can help shape fairer ways to assess scholars and guide better academic policy [16, 17]. The results could improve how funding is distributed across projects, strengthen mentorship efforts, and boost support systems within universities, leading gradually toward a more inclusive system.

This study adds value to current work on gender fairness in science. Using network methods, it explores hidden trends in how researchers collaborate and cite one another, offering clearer insight into who contributes what. Data, rather than assumptions, shapes the understanding. Findings may support better evaluation practices across scientific fields.

The main objective of this study is to develop and validate a network-based approach to assess how an author’s gender influences scientific productivity and citation impact. To achieve this objective, the following research tasks were formulated:

1. To construct a large-scale co-authorship network based on recent publication data and compute authors’ network centrality measures.
2. To analyze gender differences in individual citation metrics while controlling for network position and research productivity.
3. To examine the effect of team gender composition on the citation impact of publications.

By addressing these tasks, the study fills an important research gap by offering an integrated analysis of gender disparities using network analytics, which has not been captured by traditional bibliometric methods.

2. Methods and tools

2.1. Data collection

In many earlier studies [18], the Microsoft Academic Graph (MAG) dataset was used, as it provided up-to-date information about scholarly publications. However, Microsoft announced in June 2021 that the Microsoft Academic Graph would be retired at the end of 2021 [19]. For this reason, the present study relies on OpenAlex, an open, continuously updated index of global scholarly publications developed by OurResearch, the same non-profit organization that created Unpaywall [20]. OpenAlex contains nearly all of the information required for our analysis, with the important exception of author gender. It is widely regarded as the successor to MAG and largely preserves a similar underlying schema [21].

OpenAlex represents scholarly communication using six core elements - works, authors, institutions, venues, concepts, and sources - all connected in a network structure useful for deep data analysis. In this research, focus lies mainly on works, authors, institutions, and venues because these support building both citation and joint-author networks. The works category holds rich details like IDs, article names, summaries, release years, how often cited, and referenced studies. These features make it possible to map who cites whom, spot key articles, and calculate metrics including overall citations or adjusted influence by subject area. Also, each publication includes an author list with standardized info on contributors and where they’re based, which helps trace cooperative links across academia.

The authors object in OpenAlex expands options for analyzing individual researchers, providing consistent names alongside distinct IDs, ORCID connections, institutional timelines, numbers of publications, and total citations. Such detailed data supports precise separation of similar author identities; this accuracy matters when assigning gender or tracking research activity. Instead of relying solely on name matching, the work combines scholar details with outside tools like Genderize.io or Gender API to estimate researcher gender. As a result, it becomes possible to link personal profiles to publishing history and citation patterns. Thus, differences between genders in productivity, influence via citations, and positioning within collaboration networks can be examined per field.

To keep data size under control and maintain even time coverage without losing analysis quality, the OpenAlex database was retrieved automatically using its open REST API [22]. Given that the complete OpenAlex collection includes many millions of entries, pulling everything at once would have made network modeling too slow or impossible. Instead, random sampling stratified by year was applied. From every year between 2021 and 2025, around 10,000 records were drawn at random [23], restricted to those marked as journal papers. The resulting set therefore stayed reflective of broader trends while remaining small enough to process efficiently, supporting solid statistical work and network mapping without creating overly dense networks or high memory use.
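The year-stratified draw described above can be illustrated with a minimal sketch. It assumes the candidate records are already available locally as dictionaries with a `publication_year` field (the field name follows the OpenAlex work object); the function name, the seed value, and the toy pool are hypothetical and are not taken from the authors' scripts.

```python
import random
from collections import defaultdict


def stratified_sample_by_year(records, per_year, seed=42):
    """Draw up to `per_year` records uniformly at random from each year."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["publication_year"]].append(rec)

    rng = random.Random(seed)  # fixed seed (assumed value) keeps the draw repeatable
    sample = []
    for year in sorted(by_year):
        pool = by_year[year]
        k = min(per_year, len(pool))  # a stratum may hold fewer records than requested
        sample.extend(rng.sample(pool, k))
    return sample


# Toy candidate pool: 50 hypothetical records for each year 2021-2025.
pool = [
    {"id": f"W{year}{i:03d}", "publication_year": year}
    for year in range(2021, 2026)
    for i in range(50)
]
subset = stratified_sample_by_year(pool, per_year=10)
```

Sampling within each year separately, rather than from the pooled set, is what keeps the temporal coverage even.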

In the data collection phase, API calls were set up to find records that met four conditions: the publication type was marked as a journal article; proper author information, including clear author IDs, was present; the citation count was above zero, which ensured that cited work was included; and affiliation details were present, so only entries with institutional links passed the filter.
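As a rough illustration, the four inclusion criteria can be expressed as a client-side predicate over a work record. The field names (`type`, `cited_by_count`, `authorships`, `author.id`, `institutions`) follow the OpenAlex work object; the accepted type labels and the reading of "affiliation details present" as "at least one authorship lists an institution" are assumptions of this sketch, not details confirmed by the source.

```python
# Assumed labels for journal articles; the exact value used by the authors is not stated.
JOURNAL_TYPES = {"article", "journal-article"}


def passes_filters(work):
    """Return True if a work record satisfies the four inclusion criteria."""
    if work.get("type") not in JOURNAL_TYPES:
        return False
    if (work.get("cited_by_count") or 0) <= 0:
        return False
    authorships = work.get("authorships") or []
    if not authorships:
        return False
    # Every listed author must carry an identifier.
    if any(not (a.get("author") or {}).get("id") for a in authorships):
        return False
    # Assumption: one authorship with an institution is enough.
    return any(a.get("institutions") for a in authorships)


example = {
    "type": "article",
    "cited_by_count": 3,
    "authorships": [
        {"author": {"id": "A1"}, "institutions": [{"id": "I1"}]},
        {"author": {"id": "A2"}, "institutions": []},
    ],
}
kept = passes_filters(example)
```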

Each selected entry had enough details to build clear author links and citation maps. To gather them, custom Python tools moved step by step through API pages, collecting data yearly. Info like paper name, date, writers, institutions, citations, and cited works went into an organized setup ready for later processing.

Before gender inference, the dataset comprises 47,314 publication records and 355,193 author records.

2.2. Network construction

Once data gathering and cleaning were done, a co-authorship network was built to show how scientists collaborate with each other. The network is an undirected weighted graph whose nodes stand for individual authors:

$$G = (V, E, w), \tag{1}$$

where $V$ is the set of all authors, $E$ records which of them collaborated, and $w$ assigns each edge a positive value reflecting collaboration intensity. If two researchers, say $i$ and $j$, wrote at least one paper together from 2021 to 2025, then a link $(i, j) \in E$ connected them.

To reduce the oversized impact of papers with many authors, edge strengths were calculated by splitting credit equally. When a paper $p$ has $n_p$ contributors, every unordered pair $(i, j)$ from that work was assigned a fractional share instead of a full unit:

$$w_{ij}^{(p)} = \frac{1}{n_p - 1}. \tag{2}$$

The total connection strength between two scientists $i$ and $j$ was calculated by adding up their shares across all papers they wrote together:

$$w_{ij} = \sum_{p \in P_{ij}} \frac{1}{n_p - 1}, \tag{3}$$

where $P_{ij}$ denotes the set of all papers written together by authors $i$ and $j$. Because of this setup, repeated teamwork creates more weight in connections; however, contributions from huge research groups are downweighted, which helps balance differences between disciplines and typical team sizes when making comparisons.
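The fractional weighting described above can be sketched in a few lines. Each paper is represented here simply as a list of author identifiers; the function name and the toy input are hypothetical, and single-author papers are skipped because they contribute no pairs.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_edges(papers):
    """Accumulate 1 / (n_p - 1) for every unordered author pair of each paper."""
    weights = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))  # guard against repeated author IDs
        n = len(unique)
        if n < 2:
            continue  # a single-author paper creates no co-authorship edge
        share = 1.0 / (n - 1)
        for i, j in combinations(unique, 2):
            weights[(i, j)] += share  # repeated collaboration adds weight
    return dict(weights)


# A and B co-author a two-author paper and a three-author paper with C.
papers = [["A", "B"], ["A", "B", "C"], ["D"]]
edges = build_coauthor_edges(papers)
```

In this toy case the pair (A, B) receives 1 + 1/2 = 1.5, while (A, C) and (B, C) each receive 1/2, which shows how larger teams contribute less per pair.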

The built network included around 261,000 separate researchers linked by nearly 4 million symmetric weighted connections, totaling about 4.6 million in combined strength. Each connection stands for a specific co-author pair and is saved in coauthor_edges_2020_2025.csv using three fields: one for the first researcher, another for the collaborating partner, and a third for the collaboration intensity. Since the structure reflects actual cooperation patterns pulled only from journal publications found in OpenAlex, it ensures a uniform data layout alongside stable credit attribution over time. Based on those edges, we created a per-author node dataset stored as author_nodes_fast.csv. For each researcher, this table shows how many distinct partners they have worked with, the overall strength of those ties, and key network indicators such as closeness to others, bridging capacity, and eigenvector importance, together with the connected component they belong to. These indicators reflect the scale and density of a scholar’s immediate co-authorship circle, their nearness within the structure, and their possible impact via links to well-connected peers.

The indicators were computed using a custom C++ program, fast_graph_metrics.cpp. The program reads the weighted edge list together with the complete list of known authors, maps author identifiers to contiguous integer indices, and constructs an adjacency-list representation in memory to enable linear-time traversals. During preprocessing, self-loops are removed and duplicate edges are merged so that only a single undirected link is retained for each author pair. Connected components are identified via a depth-first search procedure, and the size of every component is recorded while ensuring that isolated authors are also preserved in the output.
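The preprocessing steps listed above (index mapping, self-loop removal, duplicate merging, and component labeling) can be sketched in Python for illustration. This is not the authors' C++ program: it ignores edge weights, uses an explicit stack for the depth-first search, and all names are hypothetical.

```python
def connected_components(author_ids, edges):
    """Label each author with a component index and record component sizes."""
    index = {a: k for k, a in enumerate(author_ids)}  # contiguous integer indices
    adj = [set() for _ in author_ids]                 # sets merge duplicate edges
    for u, v in edges:
        if u == v:
            continue  # drop self-loops
        i, j = index[u], index[v]
        adj[i].add(j)
        adj[j].add(i)

    comp = [-1] * len(author_ids)
    sizes = []
    for start in range(len(author_ids)):
        if comp[start] != -1:
            continue
        label = len(sizes)
        comp[start] = label
        stack, size = [start], 0
        while stack:  # iterative depth-first search
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if comp[nb] == -1:
                    comp[nb] = label
                    stack.append(nb)
        sizes.append(size)
    return comp, sizes


authors = ["a", "b", "c", "d", "e"]
links = [("a", "b"), ("b", "a"), ("b", "c"), ("d", "d")]
labels, sizes = connected_components(authors, links)
```

Because the outer loop visits every author, isolated authors such as `d` and `e` still receive their own component of size one.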

Network metrics are obtained using a combination of exact and scalable approximation algorithms. Harmonic closeness is estimated by multi-source breadth-first search starting from a fixed set of pivot nodes distributed across the graph, whereas betweenness centrality is approximated using a sampling-based variant of the Brandes algorithm with a predefined number of source nodes. Eigenvector centrality is computed on the largest connected component via iterative power iteration until convergence. All algorithms are parallelized with OpenMP and rely on fixed random seeds to guarantee deterministic and fully reproducible results. The program produces two main outputs: author_nodes_fast.csv, containing all computed metrics for each author, and component_summary_fast.csv, summarizing the size and density of every connected component. The enriched author table used in subsequent stages of the analysis is obtained by joining these network metrics with bibliometric indicators such as publication counts and citation totals. This yields a mathematically consistent, reproducible, and computationally efficient representation of global scholarly collaboration that forms the analytical foundation for the remainder of the study.

2.3. Graph computation

The calculation of graph metrics took place on the built co-authorship network to measure researchers’ structural positions. Every node stands for an author; meanwhile, each undirected edge with weights shows joint work, where higher values point to more frequent cooperation. Instead of simple links, edges carry numerical strength based on recurring teamwork. Various centrality and influence scores were derived to outline each scholar’s standing inside this web. These measures reflect immediate connections as well as broader reach across the system - helping examine how collaborative patterns connect to output volume and citation results.

Key metrics are:

Degree centrality. Degree centrality $k(i)$ shows how many immediate links an author has, indicating the extent of their collaboration network:

$$k(i) = \text{number of distinct coauthors of } i. \tag{4}$$

Weighted degree (strength). Weighted degree or strength $s(i)$ generalizes this idea by including how often and how strongly actors cooperate, using frequency alongside interaction intensity instead of just counting links:

$$s(i) = \sum_{j} w_{ij}, \tag{5}$$

where $w_{ij}$ denotes the weight of the edge between authors $i$ and $j$.

Harmonic closeness centrality. Harmonic closeness centrality $C_H(i)$ shows a scholar’s average proximity to others in the structure, giving more weight to nearer connections while still working when parts of the network are isolated:

$$C_H(i) = \frac{1}{|V| - 1} \sum_{j \in V,\, j \neq i} \frac{1}{d(i, j)}, \tag{6}$$

where $V$ is the set of all nodes (authors) and $d(i, j)$ denotes the length of the shortest path between authors $i$ and $j$.

Betweenness centrality. Betweenness centrality $C_B(i)$ shows how much a researcher acts as a bridge, connecting others within collaboration networks. This measure highlights individuals who link separate groups, facilitating information flow across disjoint parts of the network:

$$C_B(i) = \sum_{s, t \in V,\; s \neq t,\; s \neq i,\; t \neq i} \frac{\sigma_{st}(i)}{\sigma_{st}}, \tag{7}$$

where $\sigma_{st}$ is the total number of shortest paths between nodes $s$ and $t$, and $\sigma_{st}(i)$ is the number of those paths that pass through $i$.

Eigenvector centrality. Eigenvector centrality $C_E(i)$ assesses how influential an author is, giving greater weight if they collaborate with highly central peers. It is defined by the principal-eigenvector equation:

$$C_E(i) = \frac{1}{\lambda} \sum_{j} A_{ij}\, C_E(j), \tag{8}$$

where $A_{ij}$ is the element of the adjacency matrix representing the connection between $i$ and $j$, and $\lambda$ is the largest eigenvalue associated with the network.
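To make the eigenvector definition concrete, the following sketch runs power iteration on a small weighted adjacency structure stored as a dictionary of dictionaries. Iterating with the adjacency matrix plus the identity is an assumption of this sketch, added to avoid oscillation on bipartite graphs; it leaves the eigenvectors unchanged. The tolerance and iteration cap are illustrative values, not those of the study.

```python
import math


def eigenvector_centrality(adj, tol=1e-10, max_iter=1000):
    """Power iteration for the leading eigenvector of a connected graph."""
    nodes = list(adj)
    x = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        # Multiply by (A + I): own score plus weighted neighbour scores.
        new = {v: x[v] + sum(w * x[u] for u, w in adj[v].items()) for v in nodes}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}  # unit Euclidean length
        if max(abs(new[v] - x[v]) for v in nodes) < tol:
            return new
        x = new
    return x  # best estimate if the iteration cap is reached


# Path graph a - b - c with unit weights; b is the most central node.
path = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0}, "c": {"b": 1.0}}
scores = eigenvector_centrality(path)
```

For this path graph the normalized leading eigenvector is (1/2, √2/2, 1/2), so the middle author scores highest.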

To enable shortest-path centralities on graphs of this size, the approach applies bounded approximations that maintain core mathematical properties while lowering computational load. Instead of full enumeration, harmonic closeness is computed using inverse distances gathered through multi-source BFS starting from 256 strategically placed reference points spread throughout the network. For betweenness, a stochastic variant of Brandes’ method limits processing to 512 randomly chosen origin nodes. As a result, runtime shifts from a near-cubic demand under exact calculation to a nearly linear relationship involving edge count and sample size - still reliably ranking top-central vertices. The leading eigenvector of the adjacency matrix, corresponding to eigenvector centrality, is extracted by iterative multiplication within the main connected subgraph, enforcing tight error thresholds plus a cap on repetition cycles.
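The pivot-based estimate of harmonic closeness can be sketched as follows. In an undirected graph the distance from a pivot to a node equals the distance in the opposite direction, so one breadth-first search per pivot yields one inverse-distance term for every node. This sketch assumes unweighted hop distances and picks pivots uniformly at random, whereas the source describes strategically placed pivots; the function names and the seed are hypothetical.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def approx_harmonic_closeness(adj, num_pivots=256, seed=0):
    """Estimate the mean inverse distance of each node to a sample of pivots."""
    nodes = sorted(adj)
    rng = random.Random(seed)  # fixed seed for repeatable pivot choice
    pivots = rng.sample(nodes, min(num_pivots, len(nodes)))
    total = {v: 0.0 for v in nodes}
    count = {v: 0 for v in nodes}
    for p in pivots:
        dist = bfs_distances(adj, p)
        for v in nodes:
            if v == p:
                continue
            count[v] += 1
            d = dist.get(v)
            if d:  # unreachable nodes contribute zero
                total[v] += 1.0 / d
    return {v: (total[v] / count[v] if count[v] else 0.0) for v in nodes}


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
estimate = approx_harmonic_closeness(graph, num_pivots=4)
```

When the number of pivots equals the number of nodes, as in the toy call, the estimate coincides with the exact normalized harmonic closeness.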

All measures rely on adjacency lists created straight from the weighted edge list, meaning the network is treated as an undirected graph with just one link per author duo. While building it, loops connecting a node to itself are dropped; repeated links between identical pairs get combined through weight addition - keeping overall connection intensity intact but reducing complexity. Next, a depth-first traversal splits the structure into linked clusters and logs how large each cluster is, giving every researcher a group label. As a result, lone or marginally attached individuals stay included, plus centrality scores reflect accurate local topology inside the co-authorship layout.

Parallelization cuts time by splitting source nodes across CPU threads, merging local outcomes through atomic operations. Because of this, processing duration grows nearly in proportion to edge count - making it possible to handle such graphs on one machine in just hours; reproducibility is ensured using set random seeds, ordered iterations, and consistent rounding. Two verified files emerge: a detailed node file, listing centralities and group tags, along with a grouped overview, showing cluster sizes, connections, and density values - all enabling reliable, fast groundwork for later gender analysis and pattern-based statistics.

2.4. Gender inference

The last phase of data enhancement used automated categorization via Gender-API to estimate author gender. This step aimed to attach a likelihood-based gender tag to every researcher, allowing comparisons between women and men regarding teamwork habits, output levels, or influence measured by citations. Instead of combining multiple tools, only Gender-API was applied because of its broad worldwide name coverage, consistent reliability across regions, and measurable confidence values for each result. Its system relies on an evolving database drawn from government records, population registers, and publicly confirmed personal accounts, making it well-suited for scholarly data containing varied non-Western names.

The procedure started by pulling and standardizing first names of authors from author_nodes_fast_enriched.csv. Punctuation was removed from each name, which was then transformed into basic Latin characters, leaving just one word per given name. For better speed and adherence to request limits, data chunks, each holding no more than ten thousand entries, were sent using async API queries. Retrieved results were saved locally in gender_api_cache.csv, allowing later executions to skip fresh calls if the name had already been assessed. Every reply from the service carried three pieces: a gender guess ("male", "female", or "unknown"), a likelihood value between 0 and 1, plus location details applied during analysis when provided.
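The name standardization and caching steps can be sketched with the standard library alone. The sketch decomposes accented characters, drops anything outside basic Latin letters, and keeps the first token; lowercasing is an added assumption. The `fetch` callable stands in for the real API request and is purely hypothetical, as are the function names.

```python
import re
import unicodedata


def normalize_first_name(display_name):
    """Reduce a display name to a single basic-Latin given-name token."""
    decomposed = unicodedata.normalize("NFKD", display_name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^A-Za-z\s]", " ", ascii_only)  # punctuation becomes a space
    tokens = cleaned.split()
    return tokens[0].lower() if tokens else ""


def lookup_gender(name, cache, fetch):
    """Return a cached result, calling `fetch` only for unseen names."""
    if name not in cache:
        cache[name] = fetch(name)
    return cache[name]


calls = []


def fake_fetch(name):
    # Hypothetical stand-in for a Gender-API request; returns a fixed reply.
    calls.append(name)
    return {"gender": "unknown", "accuracy": 0.0}


cache = {}
first = normalize_first_name("José-María García")
lookup_gender(first, cache, fake_fetch)
lookup_gender(first, cache, fake_fetch)  # served from the cache, no second call
```

Names written entirely in non-Latin scripts reduce to an empty string under this simple approach, so a production pipeline would need a proper transliteration step.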

To ensure both precision and broad representation, a two-step selection method was used. First, gender was inferred only when the API confidence was at least 0.6; authors meeting this threshold were accepted as classified entries, supporting robust data analysis. Second, records falling below that level were labeled "unknown": they were left out of gender-specific calculations yet kept in the full record for transparency. The process aligns with standard research methods, recognizing limits in predictive identification while keeping results comparable across different naming backgrounds. Gender-API delivered sufficiently confident ratings for over 75% of authors, resulting in a final group of around 206,000 classified authors.
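The threshold rule amounts to a small labeling function. The sketch below assumes each API reply has already been reduced to a predicted label and a confidence between 0 and 1; the function name is hypothetical.

```python
def label_gender(prediction, confidence, threshold=0.6):
    """Keep a male/female prediction only when confidence meets the threshold."""
    if prediction in ("male", "female") and confidence >= threshold:
        return prediction
    return "unknown"  # kept in the dataset, excluded from gender comparisons


labels = [
    label_gender("female", 0.60),   # exactly at the threshold: accepted
    label_gender("male", 0.59),     # just below: unknown
    label_gender("unknown", 0.90),  # the service gave no usable guess
]
```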

The verified gender data was linked to the node table, producing author_nodes_with_gender.csv. This version adds three new columns per author: gender (either male or female), gender_accuracy (a number showing confidence level), and gender_source. Each entry keeps its original author ID along with existing network measures, allowing combined study of gender together with collaboration role and output levels. As a result, this updated set offers a clear, repeatable connection between gender and key structures in the worldwide research network.

2.5. Models and statistical tests

The study examined gender gaps in scientific collaboration by comparing network statistics for men and women. Once the co-author links were mapped and tagged by gender, every researcher was described by a set of structural traits: degree, weighted degree, harmonic closeness, betweenness, and eigenvector centrality. Together, these metrics show roles within the network: degree shows the number of unique partners; weighted degree accounts for multiple papers with the same collaborators; harmonic closeness indicates how close one is to others across the network on average; betweenness highlights those who connect separate groups; and eigenvector centrality reveals influence based on ties to already central figures.

To ensure mathematical consistency, variables were adjusted via z-score normalization: each metric was shifted and scaled to have a mean of zero and a variance of one. As a result, measures on differing scales could be compared directly. Cases with uncertain gender labels were left out of statistical inference yet kept in the complete data for summary reporting. In this filtered set, records were grouped by gender; then, within groups, key summaries (mean, median, first and third quartiles, standard deviation, and skewness) were calculated, offering a basic view of how network traits differ between male and female scholars.
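For illustration, z-score normalization of one metric can be written with the standard library. The sketch uses the population standard deviation, which is an assumption since the source does not state which variant was applied; a constant column is mapped to zeros to avoid division by zero.

```python
import statistics


def zscore(values):
    """Shift and scale values to mean zero and unit (population) variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant metric carries no information
    return [(v - mean) / sd for v in values]


degree = [1, 2, 3, 10, 40]  # toy degree values
degree_z = zscore(degree)
```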

Because centrality distributions in large collaboration graphs show strong skewness and extended upper tails, rank-based techniques were used. Instead of comparing means, differences between female and male authors were assessed with the Mann–Whitney U test, which remains valid without normality assumptions. For every centrality measure, the null hypothesis $H_0$ assumed no difference between the groups, whereas the alternative $H_1$ posited a systematic difference. Two-tailed p-values are reported alongside the U statistics, with significance declared at $p \le 0.05$. To limit false positives when testing several indicators at once, thresholds were adjusted using the Bonferroni correction.
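A minimal sketch of one such comparison is given below, using `scipy.stats.mannwhitneyu`. The effect size is taken as r = |Z|/√N with Z from the normal approximation without tie correction, which is an assumed convention rather than a confirmed detail of the study; the function name and the toy samples are hypothetical.

```python
import math

from scipy.stats import mannwhitneyu


def compare_groups(female, male, n_tests, alpha=0.05):
    """Two-sided Mann-Whitney U test with a Bonferroni-adjusted threshold."""
    u, p = mannwhitneyu(female, male, alternative="two-sided")
    n1, n2 = len(female), len(male)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction (assumption)
    z = (u - mu) / sigma
    r = abs(z) / math.sqrt(n1 + n2)
    return {
        "U": float(u),
        "p": float(p),
        "r": r,
        "significant": bool(p <= alpha / n_tests),  # Bonferroni: alpha / number of tests
    }


# Toy samples with complete separation; five metrics are assumed to be tested.
result = compare_groups([1, 2, 3, 4, 5], [6, 7, 8, 9, 10], n_tests=5)
```

With very large samples, as in this study, even tiny r values can produce extremely small p-values, which is why the effect size is reported alongside significance.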

To study the data layout, boxplots along with kernel density curves were created per measure, allowing clear views of changes in medians, spread, and outliers tied to top-ranked researchers. Plots used matching scales so indicators could be compared easily. Correlation tables were also calculated separately by gender to check whether links between centrality scores varied across genders; for example, whether degree correlates more strongly with eigenvector centrality among male or among female authors, hinting at differing collaboration patterns.

The next step examined whether gender is linked to variations in how much researchers produce or how often their work gets cited. Based on data already included in the author summary table, output was measured using publication count along with the Hirsch index $h$, calculated over the observed period; citation influence was measured using total citations $C_{\text{tot}}$ combined with mean citations per article $C_{\text{avg}}$. In combination, these metrics reflect both quantity and reach of academic work while giving overlapping yet distinct views on research results.

Prior to analysis, every continuous measure underwent log-transformation to reduce skewness and the influence of extreme outliers. For every original value $x$, its adjusted form $x'$ was calculated using

$$x' = \ln(1 + x), \tag{9}$$

where $\ln(\cdot)$ is the natural logarithm. Summary statistics (mean, median, interquartile range, and standard deviation) were calculated separately for male and female authors to give a basic picture.

As with the network indicators, gender-based variations in output and citations were checked via the Mann–Whitney test, a method suited to skewed citation patterns since it does not assume a normal distribution. Tests used a two-tailed approach with $\alpha = 0.05$, while correction methods accounted for repeated testing across variables.

The final step used regression analysis to measure how gender together with structural factors affects citation impact. Separate but related models were applied, one focusing on authors and another on publications, to reflect different perspectives. OLS regression was selected because it is readily interpretable for numerical outcomes. To support reliable inference in the presence of heteroskedasticity and non-normal residuals, all models were estimated with HC3 heteroskedasticity-robust standard errors. Analyses ran in Python via statsmodels, which provides robust covariance estimators, fixed effects, and coefficient tests.

At the author level, the dependent variable was defined as the log-transformed citation count. For each researcher $i$,

$$y_i = \log(1 + \text{citations}_i), \tag{10}$$

where $\text{citations}_i$ is the total number of citations received by author $i$.

Independent variables included gender (dummy coded, with female as the focal category), productivity indicators, and the full set of network centrality measures derived earlier. All continuous predictors were standardized using z-score normalization to place them on a comparable scale and to facilitate interpretation of coefficients as effect sizes. The inclusion of both productivity and centrality variables allowed the model to separate the effects of output quantity from those of structural position. Categorical variables representing gender and unknown classifications were encoded as binary indicators.

The general model specification can be expressed as

$$\log(1 + \text{citations}_i) = \beta_0 + \beta_1 \text{Female}_i + \beta_2 \text{Unknown}_i + \beta_3 \text{NumPapers}_i + \beta_4 \text{Degree}_i + \beta_5 \text{HarmonicCloseness}_i + \beta_6 \text{Betweenness}_i + \beta_7 \text{Eigenvector}_i + \varepsilon_i, \tag{11}$$

where $\varepsilon_i$ denotes the model residual. Diagnostics confirmed that the inclusion of standardized predictors mitigated issues of multicollinearity, and influence statistics were monitored to detect potential leverage points. The resulting model explains individual citation performance as a function of both gender and structural position within the co-authorship network.
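A minimal sketch of fitting such an author-level model with statsmodels and HC3 standard errors is shown below. The data are randomly generated placeholders with no relation to the study's results, and the column names are hypothetical; only the overall specification (log-transformed outcome, gender dummies, standardized predictors, `cov_type="HC3"`) mirrors the description above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic placeholder data; values carry no empirical meaning.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "gender": rng.choice(["male", "female", "unknown"], size=n),
    "num_papers": rng.poisson(2, size=n) + 1,
    "degree": rng.poisson(10, size=n),
    "harmonic": rng.random(n),
    "betweenness": rng.exponential(1.0, size=n),
    "eigenvector": rng.random(n),
    "citations": rng.poisson(5, size=n),
})

# Binary indicators; male is the implicit reference group.
df["female"] = (df["gender"] == "female").astype(int)
df["unknown"] = (df["gender"] == "unknown").astype(int)
df["log_cit"] = np.log1p(df["citations"])  # log(1 + citations)

# z-score standardization of continuous predictors.
predictors = ["num_papers", "degree", "harmonic", "betweenness", "eigenvector"]
for col in predictors:
    df[col + "_z"] = (df[col] - df[col].mean()) / df[col].std(ddof=0)

formula = "log_cit ~ female + unknown + " + " + ".join(c + "_z" for c in predictors)
model = smf.ols(formula, data=df).fit(cov_type="HC3")  # heteroskedasticity-robust SEs
```

Calling `model.summary()` then reports the coefficients together with the robust standard errors.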

At the publication level, a second regression model was designed to estimate how team composition and gender diversity influence the citation impact of individual papers. The dependent variable was defined as the log-transformed citation count. For each work $p$,

$$y_p = \log(1 + \text{citations}_p), \tag{12}$$

where $\text{citations}_p$ denotes the total number of citations received by paper $p$.

Independent variables included the share of female authors within the team, the total number of coauthors, the squared team size to capture diminishing returns of collaboration, and indicators for mixed-gender teams and gender diversity. Diversity was operationalized through the Blau index, defined as one minus the sum of squared gender proportions within the team. Year and journal (venue) fixed effects were added to control for temporal variation and disciplinary differences in citation behavior.
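The Blau index for a single team can be computed as in the sketch below. Authors with an "unknown" label are excluded before the proportions are taken, which is an assumption of this sketch since the source does not say how unclassified team members were handled; the function name is hypothetical.

```python
def blau_index(genders):
    """One minus the sum of squared gender proportions among classified authors."""
    known = [g for g in genders if g in ("male", "female")]
    if not known:
        return None  # diversity is undefined without classified authors
    n = len(known)
    shares = [known.count(g) / n for g in ("male", "female")]
    return 1.0 - sum(s * s for s in shares)


all_male = blau_index(["male", "male", "male"])
balanced = blau_index(["male", "female"])
skewed = blau_index(["female", "male", "male", "male", "unknown"])
```

With two categories the index ranges from 0 for a single-gender team to 0.5 for an evenly split team.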

The model specification can be written as

$$\log(1 + \text{cit}_p) = \beta_0 + \beta_1 \text{FemRatio}_p + \beta_2 \text{TeamSize}_p + \beta_3 \text{TeamSize}_p^2 + \beta_4 \text{MixedTeam}_p + \beta_5 \text{DiversityIndex}_p + \text{YearFE} + \text{VenueFE} + \varepsilon_p, \tag{13}$$

where $\varepsilon_p$ represents the error term. Both models were estimated on the full samples of authors and works, respectively. Residual plots and multicollinearity diagnostics confirmed the validity of OLS assumptions under robust standard errors. All coefficients were interpreted in semi-elastic terms, meaning that each unit change in a standardized predictor corresponds to a proportional change in expected citation impact on the log scale.
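The semi-elastic reading of a coefficient can be made explicit with a short derivation. The numerical value of $\beta$ below is purely hypothetical and is used only to show the arithmetic; it is not an estimate from this study.

```latex
% Effect of switching a binary indicator D from 0 to 1, other terms held fixed:
\log(1 + c_{D=1}) - \log(1 + c_{D=0}) = \beta
\quad\Longrightarrow\quad
\frac{1 + c_{D=1}}{1 + c_{D=0}} = e^{\beta}

% Percentage change in (1 + citations):
\%\Delta = 100\,\bigl(e^{\beta} - 1\bigr)

% Hypothetical illustration with \beta = -0.05:
\%\Delta = 100\,\bigl(e^{-0.05} - 1\bigr) \approx -4.88\%
```

Strictly, the change applies to one plus the citation count; for papers or authors with many citations it is close to the percentage change in citations themselves.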

This approach combines individual output, collaboration patterns, and team composition in one statistical framework for examining citations, with gender entering the models directly. Standardized variables, robust standard errors, and fixed effects make the findings consistent, interpretable, and reliable at each analytical level.

3. Results

3.1. Results

The evaluation of research output and citation rates reveals gender-based diferences that are statistically significant, yet minimal in real-world impact[ 24][25]. Since publication and citation data tend to cluster at lower values with a few high outliers, medians alongside means are presented; comparisons between men and women rely on non-parametric methods accordingly.

Male researchers publish somewhat more than female researchers on average (mean = 1.3375 vs. mean = 1.2491); however, both groups have the same median. Rather than differing in typical productivity, the groups diverge mostly at the upper end: men make up a greater share of top-publishing individuals. A Mann–Whitney test indicates that this gap is highly unlikely to arise by chance ($p = 6.87 \times 10^{-88}$), though its magnitude remains tiny ($r = 0.0280$), suggesting almost no practical distinction between the two groups’ overall publishing levels.

Note: Network metrics are computed from co-authorship relationships. Sample comprises 269,205 authors from scientific collaboration networks (2021-2025). Component size is the size of the connected component in the collaboration network.

The findings in Table 2 suggest that men publish slightly more than women (average 1.3375 vs. 1.2491; median = 1 for both), yet citation rates at the individual level are similar between genders based on median values. Rather than reflecting meaningful gaps, the observed significance often stems from large sample sizes combined with highly skewed (right-tailed) citation distributions; real-world disparities remain minor and appear to be driven mainly by a higher concentration of extreme high-output cases among male authors. Given these patterns, the subsequent models examine whether collaboration networks or team composition help explain differences in citation outcomes after accounting for output volume.

Note: Two-tailed Mann-Whitney U tests compare male (N = 131,517) vs female (N = 75,295) authors. Effect size $r = |Z|/\sqrt{N}$. Thresholds: negligible ($r < 0.10$), small (0.10–0.30), medium (0.30–0.50), large ($r \geq 0.50$). Direction indicates which group has the higher median. Significance: *** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$.
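The effect-size convention in the note can be sketched in a few lines. The example below is only illustrative: the function names are hypothetical, and the Z statistic passed in is an invented input rather than a value reported in this study; only the group sizes and the thresholds are taken from the note.

```python
import math


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N) for a Mann-Whitney U test."""
    if n <= 0:
        raise ValueError("n must be positive")
    return abs(z) / math.sqrt(n)


def magnitude(r):
    """Label r using the thresholds stated in the table note."""
    if r < 0.10:
        return "negligible"
    if r < 0.30:
        return "small"
    if r < 0.50:
        return "medium"
    return "large"


# Group sizes from the note; the Z value is a hypothetical input.
n_total = 131_517 + 75_295
r = effect_size_r(z=-12.7, n=n_total)
print(round(r, 4), magnitude(r))
```

Because r divides the test statistic by the square root of the sample size, a very large sample can yield an extremely small p-value while r stays in the negligible range, which is the pattern described above.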

Taking the collaboration structure into account, network data reveal minor yet notable gender-related patterns. Women tend to occupy slightly more central roles across multiple indicators. As noted in Table 3, degree centrality favors women (median 12 vs. 11; average 30.19 vs. 29.52; $p < 0.001$, $r = 0.0248$). Weighted connections also appear denser for women (median 13 vs. 11; mean 33.89 vs. 33.53; $p < 0.001$, $r = 0.0237$). On closeness, women again rank higher (median 0.0799 vs. 0.0770; $p < 0.001$, $r = 0.0270$). Eigenvector centrality follows the same direction (median $5.69 \times 10^{-13}$ vs. $7.58 \times 10^{-14}$; $p < 0.001$, $r = 0.0315$). However, for betweenness, both genders have a median of zero, although the male mean is higher (4.07 vs. 2.46), suggesting that brokerage roles are concentrated among a small subset of authors. Component size has identical medians (163,977) and differs only trivially on average ($p = 0.01195$, $r = 0.0048$). Overall, these outcomes point to subtle differences in network placement rather than deep gender-based divides in collaboration structures.
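To make the two simplest indicators concrete, the sketch below derives degree and weighted degree from per-paper author lists. It is a toy illustration rather than the pipeline used in this study: the function names and the sample data are hypothetical, and the edge weight is assumed here to be the number of jointly authored papers.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_weights(papers):
    """Map each author to {coauthor: number of joint papers}."""
    weights = defaultdict(lambda: defaultdict(int))
    for authors in papers:
        # Each unordered pair of distinct authors on a paper adds one joint paper.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def degree(weights, author):
    """Number of distinct co-authors."""
    return len(weights[author])


def weighted_degree(weights, author):
    """Sum of edge weights over all co-authors."""
    return sum(weights[author].values())


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["A", "B", "C"], ["A", "B"], ["C", "D"]]
w = coauthor_weights(papers)
print(degree(w, "A"), weighted_degree(w, "A"))
```

In this toy network author A has two distinct co-authors but three co-authorship links in total, because A and B share two papers; this is why weighted degree can exceed degree.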

The study uses author-level regression to test whether gender gaps in citations persist after adjusting for personal output and collaboration patterns. Citation impact is modeled as log(1 + total citations), which reduces the influence of extreme values and makes coefficients interpretable as approximate percentage changes in expected citations.

To assess stability and clarify what drives the gender effect, we estimate several model specifications. In Table 4, Model 1 is the starting point and includes gender and output measures only. In this baseline specification, the female coefficient is close to zero and not statistically significant ($\beta = 0.004$, $p = 0.729$), indicating that a gender gap is not observed when network position is not considered.

We then re-estimate the models with additional controls capturing authors’ roles in the co-authorship network (Models 2–5). Once network factors are included, a modest but consistent disadvantage for women appears. In the preferred parsimonious specification (Model 2), the female coefficient is $\beta = -0.056$ ($p < 0.001$), corresponding to approximately 5.5% fewer citations than comparable male authors with similar output and collaboration profiles. The shift from Model 1 to network-adjusted models serves as an indirect check on mechanism: if inequality were primarily due to differential access to collaborators, controlling for network ties would be expected to reduce the gender coefficient. Instead, the estimate becomes more negative, which is more consistent with unequal returns to comparable connections than with differences in who is connected.

Note: DV: log(1 + total citations). All continuous predictors are standardized (z-scores). HC3 heteroskedasticity-robust SEs. p-values in parentheses; percentage change in citations [$\exp(\beta) - 1$] in brackets. *** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$. Model 2 is recommended (no severe multicollinearity; VIF < 2). Model 5 has high VIF due to correlated centrality measures.

Network position is strongly associated with citation impact. In Model 2, degree centrality is a particularly strong predictor ($\beta = 1.106$, $p < 0.001$), implying substantially higher citation counts for more well-connected authors. By contrast, in Table 5, betweenness and eigenvector centrality enter with negative coefficients once degree is controlled for, suggesting overlap among these measures and indicating that, conditional on direct connectedness, brokerage or eigenvector-based prominence does not add citation advantage in this specification.

Across alternative specifications, the female penalty remains present, while the most saturated model exhibits substantial multicollinearity, making the simpler specification more reliable for interpretation. Overall, the author-level results point to a nuanced pattern: no gender gap is visible when only output is considered, but a small disadvantage emerges once collaboration structure is held constant, consistent with differences in recognition despite broadly comparable network involvement.

The work-level analysis tests whether the gender composition of author teams is associated with citation impact at the paper level. The dependent variable is modeled as log(1 + citations). Team composition is captured using the proportion of female authors, an indicator for mixed-gender teams, and Blau’s diversity index. Team size is included both linearly and quadratically to allow for diminishing returns. Year fixed effects account for time-related shifts in citation accumulation, and venue fixed effects control for systematic differences in baseline visibility across publication outlets.

Note: N = 269,205. DV: log(1 + total citations). HC3 robust SEs. Percentage change computed as [$\exp(\beta) - 1$] × 100. All continuous predictors standardized (mean = 0, SD = 1). Model R² = 0.246 (Adj. R² = 0.246); all VIF < 2.

Note: N = 47,314 papers. DV: log(1 + citation count). HC3 robust SEs. Percentage change: [$\exp(\beta) - 1$] × 100. Model R² = 0.415. Female ratio is computed among known genders (0–1). Blau index: $1 - \sum_k p_k^2$ across male/female/unknown (range 0 to 0.667). *** $p < 0.001$.

The results in Table 6 show a clear and robust relationship between team composition and citations. A higher female proportion is associated with lower expected citations ($\beta = -0.836$, $p < 0.001$); moving from an all-male (0) to an all-female (1) team corresponds to approximately 56.7% fewer citations, holding constant team size, year, and venue. In contrast, mixed-gender teams exhibit a positive association with citation impact ($\beta = 0.270$, $p < 0.001$), corresponding to roughly 30.9% more citations than non-mixed teams. Gender diversity, as measured by Blau’s index, is strongly positive ($\beta = 2.648$, $p < 0.001$), indicating that more gender-balanced teams are associated with substantially higher citation impact.

Team size follows an expected pattern: as shown in Table 6, larger teams receive more citations on average ($\beta = 0.140$, $p < 0.001$), while the negative quadratic term ($\beta \approx -0.001$, $p < 0.001$) indicates diminishing marginal returns as teams become very large. Year indicators for recent publication years (e.g., 2024 and 2025) are sharply negative, consistent with the fact that newer papers have had less time to accumulate citations. Venue fixed effects further ensure that differences driven by outlet visibility are not attributed to team composition.

Table 7. Summary of research questions and corresponding findings.

| Research question | Finding | Support |
|---|---|---|
| | Males publish marginally more (mean +7%), but individual-level citation impact is close to gender-neutral: female median citations are slightly higher (146 vs 145). Effect sizes for citation metrics are negligible ($r \approx 0.005$). | Strong |
| RQ3a: Network → citations (author) | Female penalty is absent without network controls ($\beta = 0.004$, ns) but emerges when network position is controlled ($\beta = -0.056$ to $-0.084$, $p < 0.01$). Degree centrality strongly predicts citations (+202% per SD), but returns appear weaker for females. | Strong |
| RQ3b: Team composition → citations (work) | Female-heavy teams receive substantially fewer citations (−57% for all-female vs all-male). Mixed-gender teams show a diversity bonus (+31%), and Blau diversity is strongly positive (+1313% from 0 to max). | Strong |
| The paradox | Despite comparable or stronger network integration, female authors and female-heavy teams are under-cited. The gap appears in how work is received (citation practices) rather than how collaborations form (network structure). Diversity, rather than female presence alone, is most strongly associated with higher impact. | Strong |

Note: Support indicates strength of statistical evidence and robustness across specifications.

Overall, the work-level results suggest that team gender composition is meaningfully related to how research is received in citation practices. Female-dominated teams experience a substantial citation disadvantage, whereas mixed and more gender-diverse teams tend to achieve higher impact, net of team size and publication context. These findings complement the author-level analysis by shifting attention from individual network position to team configuration and its association with citation outcomes.

Table 7 summarizes the research questions and the corresponding findings from the data analysis. With DV = log(1 + citations), coefficients are interpreted as

$$\text{Percent change} = (\exp(\beta) - 1) \times 100\%. \tag{14}$$

Examples:

• Female ($\beta = -0.056$) ⇒ about −5.5% citations.

• Degree ($\beta = 1.106$) ⇒ about +202% citations per 1 SD.

• Diversity ($\beta = 2.648$) ⇒ about +1313% citations (0 to max).
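The conversion in Eq. (14) can be checked numerically with a short sketch. The helper name and the dictionary labels below are hypothetical; the coefficients are the rounded values printed in the text, so the results may differ from the reported percentages in the first decimal place. Each value describes a one-unit change in the predictor as it entered the model.

```python
import math


def percent_change(beta):
    """Percent change from Eq. (14): (exp(beta) - 1) * 100."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients as printed in the text (rounded to three decimals).
reported = {
    "Female (author level)": -0.056,
    "Degree centrality": 1.106,
    "Female share (work level)": -0.836,
    "Mixed team": 0.270,
    "Blau index": 2.648,
}

for name, beta in reported.items():
    print(f"{name}: {percent_change(beta):+.1f}%")
```

For small coefficients the percent change is close to 100·β, whereas for large ones such as the degree or Blau coefficients the exponential form matters and the change far exceeds 100·β.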

The ‘Network Advantage Paradox’ can be summarized as follows:

1. At baseline (Model 1), there is no female penalty ($\beta = 0.004$, $p = 0.73$).

2. The penalty appears only after controlling for network position.

3. This suggests that returns to network position differ by gender.

3.2. Discussion and limitations

Our findings highlight two linked patterns. First, at the author level we observe a Network Advantage Paradox: there is no female disadvantage in the baseline specification (Model 1: $\beta = 0.004$, $p = 0.73$), yet a small gap emerges once network position is controlled (Model 2: $\beta = -0.056$, $p < 0.001$). This suggests that unequal access to collaboration networks is not the primary constraint; rather, the returns to comparable network positions differ by gender. Second, at the work level, team composition is strongly associated with citation impact: papers with a higher female share receive fewer citations ($\beta = -0.836$, $p < 0.001$), while mixed-gender teams show a positive premium ($\beta = 0.270$, $p < 0.001$), and overall gender diversity is strongly beneficial (Blau index: $\beta = 2.648$, $p < 0.001$), net of team size, publication year, and venue. Taken together, these results motivate interventions that improve recognition for female-led research and ensure that diverse collaborations translate into visible impact.

Additionally, the dataset (sourced from OpenAlex) may contain coverage biases or missing gender information for some authors. Approximately 42% of authors in the dataset could not be gender-classified with high confidence, which may affect the generalizability of our findings. Another limitation is related to the use of Gender-API, which has its own specific constraints in author gender identification; as a result, some names may not be correctly recognized or may remain unclassified, which is a possible source of error in gender attribution. Furthermore, the study does not differentiate between research fields and does not examine whether the observed effects vary across different disciplines. It can be assumed that patterns of scientific productivity may differ between authors working in the humanities and those in technical or engineering fields. However, these limitations do not undermine the main finding of a persistent gender gap, which proved robust across various tests. Given our findings on the benefits of mixed-gender teams, further research could focus on the qualitative aspects of collaboration, in particular on how gender-diverse teams organize their work and distribute roles in order to achieve greater scientific impact.

Based on the results of the study, the following key recommendations can be identified for research leaders seeking to reduce gender gaps in science:

Sponsor initiatives (reducing the “returns” gap). Because the female disadvantage appears only after adjusting for network position, equal connectedness does not necessarily produce equal recognition. Sponsor initiatives go beyond mentoring by emphasizing active advocacy: senior academics nominate and promote female researchers and female-led teams for high-visibility opportunities such as keynotes, invited talks, editorial roles, program committees, and prominent collaborative projects. Rather than simply increasing ties, sponsorship is intended to convert existing network participation into acknowledgment, directly targeting the mechanism suggested by the Network Advantage Paradox.

Bridge-building grants (creating high-visibility partnerships and dissemination channels). The work-level results show that team composition remains associated with citations even after controlling for venue and team size, with mixed and diverse teams linked to higher impact. Bridge-building grants can incentivize collaborations across institutions, countries, or subfields, while supporting female researchers as PIs or co-PIs in leading these partnerships [26]. By expanding dissemination pathways and strengthening cross-network connections, such programs aim to increase the citation returns to teamwork and mitigate the lower citation rates observed for female-heavy teams.

Diversity monitoring dashboards (detecting and correcting systematic disparities). Given that team composition is a strong predictor of paper-level impact, conferences, journals, and institutions can track aggregate indicators of visibility and recognition—for example, invited speaker line-ups, editorial boards and reviewer pools, program committee composition, acceptance patterns, and downstream citation outcomes. Privacy-preserving dashboards can help detect persistent shortfalls in recognition for female-led teams relative to comparable outputs (consistent with the author-level pattern) and assess whether benefits associated with diversity are being realized broadly and fairly.

Hybrid or online conference participation (maintaining visibility during parental leave and caregiving constraints). Citation advantages often accumulate through visibility channels such as conferences, invited lectures, and networking events. Hybrid participation options (remote talks, virtual posters, streaming, and recordings) can help maintain scholarly presence for researchers who cannot travel, including those on parental leave or with caregiving responsibilities. In light of our results, keeping access open to major dissemination venues may reduce gaps in how collaborative work translates into citations, particularly for women and female-led teams.

4. Conclusions

The results of this study confirm that gender disparities in scientific impact persist: even with comparable network positions, female researchers on average receive fewer citations, and all-female teams show a significantly lower citation impact compared to male or mixed-gender teams. We achieved our objective of developing a network-based method for assessing gender disparities. By applying this approach, we were able to quantitatively capture a previously unmeasured gap between collaboration network position and citation outcomes for male and female scientists. The obtained results provide strong evidence for shaping research evaluation policies, in particular for supporting initiatives aimed at promoting gender-balanced collaboration and fair recognition of all researchers’ contributions. We believe that the evidence and recommendations presented here can serve as a valuable guide for future policies and interventions aimed at strengthening gender equality in academia.

Declaration on Generative AI

The authors have not employed any Generative AI tools.

References

[1] C. B. Fell, C. J. König, Is there a gender difference in scientific collaboration? a scientometric examination of co-authorships among industrial-organizational psychologists, Scientometrics 108 (2016) 113–141.

[2] M. C. Martini, Pelle, Poggi, Sciandra, The role of citation networks to explain academic promotions: an empirical analysis of the italian national scientific qualification, Scientometrics 127 (2022) 5633–5659.

[3] Bailie, Matous, Bailie, M. E. Passey, Patterns of collaboration and knowledge generated by an australian rural research centre over 20 years: a co-authorship network analysis, Health Research Policy and Systems 21 (2023) 87.

[4] Jaffe, E. Ter Horst, L. H. Gunn, J. D. Zambrano, G. Molina, A network analysis of research productivity by country, discipline, and wealth, Plos one 15 (2020) e0232458.

[5] Fagan, K. S. Eddens, Dolly, N. L. Vanderford, Weiss, J. S. Levens, Assessing research collaboration through co-authorship network analysis, The journal of research administration 49 (2018) 76.

[6] Huang, A. J. Gates, Sinatra, A.-L. Barabási, Historical comparison of gender inequality in scientific careers across countries and disciplines, Proceedings of the national academy of sciences 117 (2020) 4609–4616.

[7] A. De Nicola, G. D'Agostino, Assessment of gender divide in scientific communities, Scientometrics 126 (2021) 3807–3840.

[8] T.-W. Chien, J. C. Chow, Chang, Chou, Applying gini coefficient to evaluate the author research domains associated with the ordering of author names: a bibliometric study, Medicine 97 (2018) e12418.

[9] Maddi, Gingras, Gender diversity in research teams and citation impact in economics and management, Journal of Economic Surveys 35 (2021) 1381–1404.

[10] D. Lukianov, K. Kolesnikova, A. Mussurmanov, R. Lisnevskyi, V. Lisnevskyi, Using blockchain technology in scientometrics, in: DTESI (workshops, short papers), 2023.

[11] A. Biloshchytskyi, A. Kuchansky, Y. Andrashko, M. Gladka, Impact of gender on publication productivity and scientific collaboration, in: 2022 International Conference on Smart Information Systems and Technologies (SIST), IEEE, 2022, pp. 1–4.

[12] O. Kuchanskyi, Y. Andrashko, A. Biloshchytskyi, S. Omirbayev, A. Mukhatayev, S. Biloshchytska, A. Faizullin, Gender-related differences in the citation impact of scientific publications and improving the authors’ productivity, Publications 11 (2023) 37.

[13] P. Chatterjee, R. M. Werner, Gender disparity in citations in high-impact journal articles, JAMA network open 4 (2021) e2114509.

[14] V. Larivière, C. Ni, Y. Gingras, B. Cronin, C. R. Sugimoto, Bibliometrics: Global gender disparities in science, Nature 504 (2013) 211–213.

[15] M. M. King, M. E. Frederickson, The pandemic penalty: The gendered effects of covid-19 on scientific productivity, Socius 7 (2021) 23780231211006977.

[16] H. F. Chan, B. Torgler, Gender differences in performance of top cited scientists by field and country, Scientometrics 125 (2020) 2421–2447.

[17] H. K. Baker, N. Pandey, S. Kumar, A. Haldar, A bibliometric analysis of board diversity: Current status, development, and future research directions, Journal of business research 108 (2020) 232–246.

[18] J. Li, Y. Yin, S. Fortunato, D. Wang, Scientific elite revisited: Patterns of productivity, collaboration, authorship and impact, Journal of the Royal Society Interface 17 (2020) 20200135.

[19] Microsoft Research, Next steps for microsoft academic - expanding into new horizons, 2021. URL: https://www.microsoft.com/en-us/research/articles/microsoft-academic-to-expand-horizons-with-community-driven-approach/.

[20] J. Priem, H. Piwowar, R. Orr, Openalex: A fully-open index of scholarly works, authors, venues, institutions, and concepts, arXiv preprint arXiv:2205.01833 (2022).

[21] T. Scheidsteger, R. Haunschild, Which of the metadata with relevance for bibliometrics are the same and which are different when switching from microsoft academic graph to openalex?, Profesional de la información 32 (2023).

[22] R. Harder, Using scopus and openalex apis to retrieve bibliographic data for evidence synthesis. a procedure based on bash and sql, MethodsX 12 (2024) 102601.

[23] M. Kumar, R. J. George, A. PS, Bibliometric analysis for medical research, Indian Journal of Psychological Medicine 45 (2023) 277–282.

[24] N. Wahid, N. F. Warraich, M. Tahira, Factors influencing scholarly publication productivity: a systematic review, Information Discovery and Delivery 50 (2022) 22–33.

[25] M. Hamerka, Identification of scientific productivity determinants, International Journal for Quality Research 14 (2020) 559.

[26] M. L. O. Olivo, R. A. Oluwakemi, Z. Lakner, T. Farkas, Gender differences in research fields of bioeconomy and rural development-based on sustainable systems in latin america and africa regions, Plos one 19 (2024) e0308713.