Centre on Institutional Data (CID)

The CID is responsible for providing institutional (contextual) data for the different data sets that may be used for scientific research. It will be central to the aim of linking different types of election studies to actual electoral outcomes. The proposed host for the CID is the Center for Socio-Political Data (CDSP) at Sciences Po Paris.

Many of the individual projects currently contribute to this component separately, although their contributions differ in standards and coverage.

MEDem will attempt to simplify and standardize the collection and distribution of institutional and contextual data to its various projects. It will also link to existing projects by gathering more fine-grained comparative data on matters such as campaign conduct, electoral systems and districting principles, or media regulations. Providing a separate Competence Centre to undertake this task will relieve other projects of the need to do so while ensuring the standardization of data used in virtually all of the research undertaken with MEDem data.


The centre will:

  • Consult with comparative projects involved in the collection of survey and textual data, as set out above;

  • Promote the usability of contextual data across datasets collected in the different projects, especially for data that exist in different languages;

  • Document the methodological and technical approaches for collecting contextual data from a variety of sources;

  • Ensure that all the contextual data is well documented and updated following a joint metadata and documentation standard, and archived in established data archives if not archived at the competence centre itself;

  • Contribute to the development of post- and pre-harmonization procedures and standards for previous and future contextual data collections;

  • Possibly take part in the preparation of the European Election Study (EES).


The CID is the unit responsible for providing contextual data on electoral democracies for the different projects under the MEDem umbrella. Its work involves consulting with projects within and outside the MEDem scope, promoting the use of contextual data, documenting the data and the approaches for collecting, documenting and using them, and developing post- and pre-harmonization procedures and standards.

The category of institutional/contextual data covers several types of data:

  • Electoral outcomes (down to the most precise geographical unit possible, including by-elections)

  • Macro-level social, economic and political indicators (from the World Bank, OECD, Eurostat, …)

  • Electoral regulations (from electoral systems and districting principles to media and campaign regulations)

  • Behavioral measurements (campaign conduct, for instance, is cited).


The CID suggests extending this definition by including, on the one hand, “parliamentary behavior” (from roll call records of individual MPs to a number of indicators of legislative activity, such as questions, amendments, and the participation and activities of committees, …) and, on the other hand, the sociology of the elite (starting with at least the gender and profession of MPs and MEPs).

The reasons for these proposed extensions are substantive (both could contribute significantly to the actual monitoring of electoral democracy) but also methodological (the rationale here is the collection of comparative “natural” data, i.e. data that are neither experimental nor based on interviews). Of course, this leaves open for discussion the role of expert (political elite) surveys in the domain of interest of this centre. It would probably mean close collaboration with the CSD in this regard.

Like the other centres, the CID does not aim to produce data itself. The main aim is to gather and coordinate the various pre-existing data collection efforts. Most of the projects currently participating in MEDem already collect their own contextual data (for instance, CSES already has a wealth of contextual variables, from district-level variables to “macro-level variables”). A number of projects currently outside MEDem have also made considerable efforts to combine existing indicators (for instance the QoG dataset or, in more specialized areas, the coding of electoral systems by M. Golder or D. Caramani).

In this sense:

  • The main task of the CID lies in networking and organization: mapping the comparative projects that deal with contextual data relevant for MEDem, reaching out to them, and trying to convince them either to participate directly in MEDem or at least to allow the use of the data they produce.


This of course has to be done in close collaboration with headquarters and the other centres. Beyond contact with existing comparative projects, it is likely that a network of national collaborators will be needed. In this regard, working with existing projects on a list of national contact points and their relevant fields of expertise would be an important step towards more integration.

  • The second broad task of the CID is working on the data and data format. As a national social science archive, CDSP is quite used to this vast enterprise of integration. Once again in collaboration with CAD, standards for data documentation and data formats have to be set up.


While DDI 3 is a likely target for documentation, an important task would be to establish a strategy for missing data imputation and trend estimation. It is indeed very likely that the main output of the CID will be a database with country-year observations as the main unit of analysis. The CID proposes to add a database at the individual candidate level as well; party characteristics could likewise be considered for future integration work. The ParlGov project (a database comprising parties, elections, and cabinets as main entries) has clearly shown the way in this direction.
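To make the imputation question concrete, here is a minimal sketch of one possible strategy for a country-year series: linear interpolation of interior gaps, with every imputed value flagged. This is only an illustration and not a method adopted by the CID; the function names, the data layout and the example values are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: one indicator per country, keyed by year.
Series = Dict[int, Optional[float]]
Panel = Dict[str, Series]


def interpolate_series(series: Series) -> Dict[int, Tuple[Optional[float], bool]]:
    """Return {year: (value, imputed_flag)}, filling interior gaps linearly.

    Only years already present as keys are considered. Leading and trailing
    gaps are left missing, because this sketch does not extrapolate.
    """
    years = sorted(series)
    known = [y for y in years if series[y] is not None]
    out: Dict[int, Tuple[Optional[float], bool]] = {}
    for y in years:
        value = series[y]
        if value is not None:
            out[y] = (value, False)  # observed value, not imputed
            continue
        before = [k for k in known if k < y]
        after = [k for k in known if k > y]
        if not before or not after:
            out[y] = (None, False)  # no observation on one side: stay missing
            continue
        y0, y1 = before[-1], after[0]
        v0, v1 = series[y0], series[y1]
        out[y] = (v0 + (v1 - v0) * (y - y0) / (y1 - y0), True)
    return out


def impute_panel(panel: Panel) -> Dict[str, Dict[int, Tuple[Optional[float], bool]]]:
    """Apply the interpolation country by country."""
    return {country: interpolate_series(s) for country, s in panel.items()}


# Made-up values for a fictional country code "AA".
example = {"AA": {2000: 10.0, 2001: None, 2002: 14.0, 2003: None}}
imputed = impute_panel(example)
```

Keeping an explicit flag next to each value lets users of the database exclude imputed observations or test how sensitive their results are to them. Other strategies (model-based imputation, for example) would fit the same layout.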

  • The third broad task of the CID is the promotion of contextual data, in close collaboration with the headquarters and the CAD. A number of projects have indeed shown that contextual data are not always used by the academic community as much as expected. Easier access, better country coverage and an expanded geographical area are likely to foster scientific interest.


Yet better access and coverage may well be insufficient to justify the effort if nothing else is done. That is why data promotion should also be at the centre of the CID's strategy. To a certain extent, networking with existing comparative projects and national country experts is likely to provide a first solid audience for the CID. While a dedicated academic conference may not be the most efficient strategy, specific workshops on methods for using contextual data would of course be key. Workshops of this kind can also easily be integrated into summer school projects.


A second strategy could consist of dedicated help for research groups aiming to use contextual data. A pool of experts supporting the centre could be set up to provide up-to-date advice (either on specific questions or on longer-term developments). The CID would coordinate these experts to ensure an efficient response. This of course does not preclude the use of other traditional means of data promotion, from actual research to the organization of events and social media activity, depending on the future communication strategy of MEDem.