How should firms share their data?

March 18, 2025 Digital

As businesses and policymakers navigate the Big Data economy, TSE Digital Center offers experienced guides to a sprawling labyrinth filled with dangerous traps and glittering opportunities. For a new study, Yassine Lefouili and Andrea Mantovani teamed up with experts at the European Commission’s Joint Research Centre in Seville to explore how firms should pool their data resources.

Why do we need to coordinate the use of data?

Rapid technological advances in our ability to create, access and process data promise huge efficiency gains. However, unlocking the value of data for many applications – such as predicting the evolution of cancer, crop yields and market dynamics, or optimizing product design – requires efficient collaboration and integration of data from multiple sources.

Consequently, policymakers are introducing measures and experimenting with technologies to help firms combine their data resources. For instance, the European Commission is developing Common European Data Spaces to facilitate data combination in several strategic sectors such as healthcare, energy, mobility, finance, manufacturing and agriculture.

How does data sharing differ from analytics sharing?

Data can be combined in various ways, and specialized platforms increasingly help firms in this endeavor by either prioritizing data access or delivering analytics services. For example, Snowflake is a cloud-based platform primarily focused on data sharing and warehousing, whereas Databricks specializes in providing Big Data analytics, machine learning, and advanced analytics workflows. 

Figure 1: Data pipeline with Snowflake technology as part of it.

Source: snowflake.com

Figure 2: Data pipeline and Databricks services.

Source: databricks.com

For our paper, we built a model in which a platform can opt for a data-sharing or an analytics-sharing technology. Under data sharing, participating firms obtain direct access to all contributed data. This enables them to perform their own analytics using a richer dataset that can also draw on any data they choose not to contribute.

In contrast, analytics sharing involves firms submitting their data to a platform that only returns analytics results from the combined dataset. The value of these insights increases with the amount of data contributed. This method enhances privacy and security, as raw data is not exchanged between firms. 
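The difference between the two technologies can be illustrated with a toy sketch. It is not the model from the paper: the `Platform` class, its method names, and the choice of a simple mean as the "analytics result" are all hypothetical, chosen only to show what each firm gets back under each mode.

```python
from statistics import mean


class Platform:
    """Toy data-combination platform (hypothetical, for illustration only)."""

    MODES = ("data_sharing", "analytics_sharing")

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.contributions = {}  # firm name -> list of contributed values

    def contribute(self, firm, records):
        """Store the records a firm chooses to hand over to the platform."""
        self.contributions[firm] = list(records)

    def pooled(self):
        """All contributed records, combined across firms."""
        return [x for records in self.contributions.values() for x in records]

    def serve(self, firm_private_records=()):
        """Return what a participating firm ends up working with.

        Data sharing: the raw pool, which the firm can merge with any
        records it kept back (firm_private_records).
        Analytics sharing: only a result computed on the pool; raw records
        never leave the platform. The mean is an assumed stand-in for
        whatever analytics the platform actually runs.
        """
        pool = self.pooled()
        if self.mode == "data_sharing":
            return {"raw": pool + list(firm_private_records)}
        return {"result": mean(pool)}


# Firm A contributes two records, firm B contributes one and withholds one.
ds = Platform("data_sharing")
ds.contribute("A", [1.0, 2.0])
ds.contribute("B", [3.0])
view_b = ds.serve(firm_private_records=[5.0])  # raw pool plus B's withheld record

an = Platform("analytics_sharing")
an.contribute("A", [1.0, 2.0])
an.contribute("B", [3.0])
insight = an.serve()  # only an aggregate over the contributed records
```

In this sketch, withheld data still adds value under data sharing because the firm runs its own analytics, whereas under analytics sharing only contributed data can shape the result.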

What are the key factors that influence a platform's technology choice?

Our analysis shows that the platform opts for analytics sharing only if it guarantees sufficiently higher data security. The more data firms possess, the larger this security advantage must be for analytics sharing to be preferred over data sharing. The decision hinges on balancing the benefits of comprehensive raw data against the need to safeguard sensitive information by reducing data transmission and points of access.

How do these technologies impact the overall amount of contributed data?

We found that analytics sharing leads to higher total data contributions under fairly reasonable conditions. A key mechanism is that, while data sharing allows firms to benefit from others' data without contributing their own, analytics sharing requires firms to contribute data in order to benefit from the combined dataset. The enhanced security and privacy of analytics sharing also foster collaboration, as firms are more comfortable sharing data if they know it will not be directly accessed by others.

Which data-combination technology is more beneficial for consumers?

The optimal choice varies. If data is primarily used to enhance product quality, consumers align with the platform in preferring analytics sharing, provided this technology offers a sufficient security advantage. However, if data is used to personalize pricing, consumers may prefer data sharing because it can limit price discrimination. 

What are the implications for managers?

The choice of a data-combination technology is a delicate one and potentially hard to reverse. Managers must weigh their firm's data sensitivity against the efficiency benefits of combined data insights. If data security is paramount, analytics sharing may be preferable. 

Firms with rich data resources may be inclined towards data sharing to maximize the value derived from their own data. However, investment in the capabilities to perform in-house analytics can be particularly costly for firms with limited ICT skills. 

In addition, managers may prefer analytics sharing for its potential to maximize the amount of data available. This is particularly relevant if the firm’s business model benefits from the volume of data it holds and processes.

The nature of contracts can also play a role. We find that public uniform contracts make analytics sharing more attractive, while personalized secret contracts favor data sharing.

Which policy recommendations emerge from your study?

To strengthen privacy protection while fostering data combination, policymakers should adopt measures that improve the data security of analytics-sharing technologies. When these technologies fail to benefit consumers, policies that promote analytics sharing should be complemented by measures to prevent data leaks and price discrimination. 

KEY TAKEAWAYS 

• Platforms prefer analytics sharing when it offers sufficiently greater data security than data sharing.

• Data sharing allows firms to exploit their own non-shared data, but also requires costly investment in capabilities to perform in-house analytics. 

• Analytics sharing encourages higher overall data contributions.

• When data is used for improving products, consumers prefer analytics sharing. 

• If firms use data for price discrimination, consumers may be better off with data sharing.

• Public uniform contracts make analytics sharing more attractive.

 

FURTHER READING

‘Data sharing or analytics sharing?’ and other publications by Yassine and Andrea are available to read on the TSE website.


Article published in TSE Reflect, March 2025