Analytic tasks and CAQDAS tools

These materials have been developed as a result of Qualitative Innovations in CAQDAS (QUIC) research into the ways researchers learn about and use computer assisted qualitative data analysis (CAQDAS) packages.

'Moving beyond initial coding' discusses software tools which support the analytic tasks that can be usefully employed after the initial coding phase while 'Combining and converting qualitative and quantitative data' explores the ways in which CAQDAS packages can be used to support the integration of qualitative and quantitative data in mixed method research.

Our sources include the common questions asked of the CAQDAS Networking Project’s email and telephone helpline (details in the footer) and questions raised by researchers attending our intermediate to advanced training workshops.

Moving beyond initial coding

We have compiled this resource to provide some ideas or starting points as to how particular tools might be used creatively to view and think about coded data differently and thereby facilitate analytic development. The ideas presented here may be usefully employed in a range of approaches to qualitative data analysis, certainly all those which employ coding. They are also relevant to the full range of software packages.

The queries received by the CAQDAS Networking Project highlight that researchers frequently know what they want to achieve analytically, but do not yet feel confident enough with their chosen software package to do so. CAQDAS packages provide several tools that can facilitate the processes of what many refer to as ‘moving on beyond coding’. Our longitudinal project tracking researchers’ use of a range of CAQDAS packages has also highlighted ‘moving beyond coding’ as a common concern or ‘sticking point’ when using software.

Thematic or conceptual coding in CAQDAS packages is straightforward from a technical point of view. It is very easy, for example, to apply codes to segments of data, and the user has complete flexibility concerning basic coding tasks. It is important, however, to bear in mind that although qualitative coding has analytic purpose, it is not analysis in and of itself. Although coding may be an early stage of analysis, at some point the researcher needs to move beyond the process of descriptive, thematic or conceptual cataloguing or indexing of data.

Further reading

For more information on this research please see the briefing paper ‘Research design for longitudinal case study project’. For more information on qualitative data analysis more generally see the methodologies section of the online QDA website.

In this section

  • Horizontal retrieval by code
  • Re-coding broad themes to generate hierarchical coding schema structures
  • Vertical retrieval by data file: the interactive margin view
  • Stepping away: outputting coded data

Viewing data for reflective purposes: basic retrieval options

Data can be retrieved in a variety of ways. As coding proceeds it is always possible to retrieve all the data so far assigned to a given theme. This may simply act as a means of reminding yourself of similar segments in order to enhance continuity if you have not thought about that code for some time. Or it may be a key analytic task when, for example, there is a need to reconsider a theme, perhaps in order to break data down into sub-categories (e.g. when working deductively) or generate a general concept from selected more detailed codes (e.g. when working inductively).

We do not discuss writing tools in detail here but it should be emphasised that noting down what is being seen and the rationale behind analytic decisions is an important part of systematic qualitative analysis whatever the methodology. Whether working inductively or deductively, or using a combination of both approaches, reflection is facilitated by iterative retrieval. The ease with which software enables changes to the way data are coded (for example, un-coding irrelevant data or expanding the size of a coded passage) is a key component of working in an iterative, reflexive and rigorous way. This goes hand-in-hand with writing about the tasks being undertaken and how they are contributing to the interpretation.

Further reading: Lewins and Silver (2007), chapters 6-9.

Horizontal retrieval by code

Retrieving all the data so far coded at a particular theme will provide a horizontal view across the entire dataset, facilitating early consideration of the similarities and differences between data segments coded amongst different data sources. Such retrieval is usually very easy to achieve, without the need to create complex queries. Subsequently it is also possible to specify retrieval options more precisely, for example to retrieve one or more codes only in certain types of data file or applied to data contributed by respondents with particular socio-demographic attributes etc. It is usually necessary to set up a query to retrieve data in more complex ways. It is often equally easy to retrieve data coded at two or more similar codes in this way. Software packages differ subtly in the ways they present basic retrieval, but options usually offer alternative means of representing data. See individual software reviews for more specific information on the options provided by particular CAQDAS packages.

As well as retrieving coded qualitative data, frequency information about coding can be presented and sorted in a number of formats. This provides a summary overview of current coding status, for example broken down by frequency of code application across the whole dataset or within a particular data file or set of respondents. Some packages provide full integration between tabular views and the corresponding qualitative data. All qualitative approaches benefit greatly from establishing a balance between visualising summarised frequency information, which highlights clusters and gaps, and being located within the source context.
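The two views described above (retrieving everything coded at a theme, and summarising how often codes have been applied) can be illustrated independently of any particular package. What follows is a minimal sketch in Python, assuming a simple in-memory list of coded segments. The names `Segment`, `retrieve` and `code_frequencies`, and the sample data, are hypothetical and invented for illustration; they do not correspond to the interface of any CAQDAS package.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source: str       # data file the segment comes from
    text: str         # the coded passage
    codes: frozenset  # codes applied to the passage


# Hypothetical sample data, invented for illustration only.
SEGMENTS = [
    Segment("interview_01", "I never felt listened to.", frozenset({"trust"})),
    Segment("interview_01", "The clinic was too far away.", frozenset({"access"})),
    Segment("interview_02", "My doctor explained everything.",
            frozenset({"trust", "communication"})),
    Segment("focus_group_A", "Nobody told us about the service.",
            frozenset({"access", "communication"})),
]


def retrieve(segments, code, sources=None):
    """Horizontal retrieval: every segment coded at `code`, optionally
    restricted to a subset of data files."""
    return [
        s for s in segments
        if code in s.codes and (sources is None or s.source in sources)
    ]


def code_frequencies(segments):
    """Count how many segments each code has been applied to."""
    return Counter(code for s in segments for code in s.codes)


# Retrieved segments keep their source, so context is never lost.
for seg in retrieve(SEGMENTS, "trust"):
    print(seg.source, "|", seg.text)

print(code_frequencies(SEGMENTS))
```

Passing `sources` restricts retrieval to particular data files, which loosely mirrors the more precise retrieval options described above.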

Re-coding broad themes to generate hierarchical coding schema structures

When retrieving broadly coded data horizontally it will be possible to re-code into more detail in order to take account of the more particular aspects of that general theme. This can be achieved without the need to un-code from the initial broad category, although this can also be achieved if required. Researchers using visually hierarchical coding schema structures often work very effectively in this way, establishing a relationship between top level codes and their sub-codes in terms of the data coded in each position.

Even when taking an inductive approach to coding, hierarchical coding might be a useful way to proceed. For example, if fairly broad categories are identified early on, it may prove more effective to continue coding in a fairly broad-brush manner initially in the knowledge that the nuances of the category and the way it is represented in the data can be easily assessed later on. It can often be easier and more systematic to undertake detailed analytic consideration concerning a theme when all generally related data have been collected together. It is often challenging to code in an inductively sequential way through a data file because you are required to think deeply and analytically about all conceivable aspects of every theme simultaneously. Working in a more broad-brush way for certain themes and re-coding at a later stage enables close consideration without other data obscuring thought processes.

This is one example of how software tools can influence analytic procedures. In practice, we find that when taking an essentially inductive approach to coding it can be useful to code the first few documents in the conventional inductive way as a means of remaining faithful to broadly ‘grounded’ approaches. This will generate a long list of fairly detailed codes. However, even when only a relatively small number of data files have been coded it will usually already be possible to establish a number of broad themes that may benefit from treatment in the way described above. As such we often find that we combine inductive and deductive approaches to coding and flick between different ways of working quite frequently and loosely.

Further reading: Lewins and Silver (2007), chapters 5-7.

Vertical retrieval by data file: the interactive margin view

Those adhering strictly to analytic strategies developed before the rise of qualitative software, or independently from it, can nevertheless make good use of CAQDAS packages. Software tools need not solely be used for the analytic purposes for which they were designed. In fact we encourage critical evaluation of software tools and the manipulation of conventional tools for unconventional purposes. Indeed most software developers are very receptive to feedback from users, and new tools are often developed in response to users’ demands.

Retrieval of several or all codes sequentially through an individual data file offers a vertical display. Such retrieval is enabled in various ways. Usually an interactive margin display showing precise code application and the co-occurrence of codes is automatically on view and is constantly updated as new codes are applied. This reflects the way that many qualitative researchers work manually, using highlighter pens. Some packages allow the user to define the colours of codes appearing in the margin. This is a relatively basic function but the analytic utility of the systematic use of colour from the outset should not be underestimated. In our experience the way any colour-based functionality is used depends largely on the style of working and way of thinking of individual researchers. Some people are simply more visually oriented and find these tools particularly attractive. Although colouring codes is a useful cosmetic visual device, many packages also allow the code margin view to be filtered according to the colour(s) applied to codes. This can be a powerful retrieval device. In addition, the colour coding function in MAXqda physically highlights coded data in one of the four available colours and applies the corresponding colour name as the code label.

Whichever software is used, the basis for colouring codes is user-defined and may be driven by any analytical, theoretical or practical purpose. If analysis is framed by existing theoretical ideas, for example, consistent colouring of codes which reflect each theoretical construct can facilitate the subsequent visual interpretation of co-occurrence and commonality. Complex team-based projects can also make effective use of colour attributes, although in such circumstances being systematic in application is of particular importance.

It is usually also possible to filter the codes on view in the margin for alternative analytic purposes. During early coding for example, identifying codes which commonly occur together in the data can produce insights which contribute to refining subsequent code applications. At a more practical level, just being able to see which data have been most heavily coded, or where the gaps are, can contribute to the process of identifying patterns and relationships. The visual overview that a margin view provides therefore has important analytic potentialities.
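Identifying codes which commonly occur together amounts to looking for code applications whose text ranges overlap within a data file. The following Python sketch illustrates that idea under stated assumptions: code applications are represented as hypothetical `(code, start, end)` tuples of character offsets, and the function names and sample data are invented for illustration rather than taken from any package.

```python
from collections import Counter
from itertools import combinations

# Hypothetical code applications in one data file: (code, start, end),
# using character offsets where `end` is exclusive.
APPLICATIONS = [
    ("trust", 0, 120),
    ("communication", 80, 200),
    ("access", 300, 420),
    ("trust", 390, 450),
]


def overlaps(a, b):
    """True when two (code, start, end) applications share some text."""
    return a[1] < b[2] and b[1] < a[2]


def co_occurrences(applications):
    """Count overlapping applications for each pair of different codes."""
    counts = Counter()
    for a, b in combinations(applications, 2):
        if a[0] != b[0] and overlaps(a, b):
            # Sort the pair so (x, y) and (y, x) are counted together.
            counts[tuple(sorted((a[0], b[0])))] += 1
    return counts


print(co_occurrences(APPLICATIONS))
```

A margin view presents the same information visually; a count like this is only a prompt to return to the overlapping passages and read them in context.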

At later stages setting filters to refine the codes on view in the margin enables concentrated consideration of particular codes without being distracted by other themes. Margin views can get very cluttered, especially in packages that show items other than codes appearing in the margin. It is therefore a useful device for practical as well as analytic purposes to ‘clean-up’ the view in order to be able to clearly identify patterns and relationships in the application of codes.

Margin views are usually interactive, providing quick access, for example to all the data coded to a particular theme across the whole dataset from a particular point in the file. It is usually also possible to execute other tasks from points in the margin view, such as un-coding data, re-sizing data segments, renaming codes etc. The ability to flick backwards and forwards between these different views provides the flexibility to view the same data from different perspectives, thereby contributing to interpretation.

Stepping away: outputting coded data

Most packages provide several ways of outputting coded data for consideration outside of the software and away from the computer. Output reports which reflect basic retrieval options within the chosen software can be generated at any point and printed or saved outside the software. It is also possible to output (parts of) data files or coded data segments to print and work with manually. Those familiar with using highlighter pens to code manually often prefer to combine working in this way with utilising software tools, and packages which automatically number paragraphs and enable the clerical application of codes according to paragraph ranges can be attractive.

Output reports may take the form of summarised overviews, perhaps detailing frequency information for further manipulation in a spreadsheet or statistical application. This is one way in which mixed methods projects can be facilitated by qualitative software. For more information on integrating qualitative and quantitative data using CAQDAS packages see the pages on combining and converting qualitative and quantitative data and analysing survey data. Summarised frequency reports can be particularly useful if conducting a longitudinal study when it is important to create snap-shots of the coding status at the end of each stage in order to facilitate temporal comparisons.

In complex team situations when not all members are directly involved in using the software, sharing coded data via output reports allows those researchers to contribute to processes such as considering the appropriateness of coded data and refining coding schema.

In this section

  • Coding schema structures: the implications of hierarchies
  • Short-cut groupings
  • Set creation when working deductively
  • Set creation when working inductively

Thinking conceptually without it ‘mattering’: code grouping tools

Initial coding is often essentially a process of data fragmentation whereby important aspects of what is ‘going on’ are indexed. Visualising the coding process graphically reveals codes as ‘stars’ such that simple links are established between the code (which may represent a theme, idea, concept etc.) and general or more specific examples of it in the data. By creating such connections you are not, however, making any assertions concerning the relationship between the individual data segments themselves, other than that they are in some way ‘about’ the theme or topic represented by the code label. You are simply stating that ‘this data segment is about this theme’.

CAQDAS packages provide several means by which codes can be grouped or connected in order to start making more solid statements about the relationship between themes, or in generating higher level concepts. These include the creation of visually hierarchical coding schema; the use of short-cut code groupings; and the establishment of functionally hierarchical or semantic links between codes.

Further reading: Lewins and Silver (2007), chapters 6-7.

Coding schema structures: understanding the implications of hierarchies

Researchers often have quite particular views on the use of hierarchical coding schema structures, some feeling constrained by their perceived inflexibility, others feeling comfortable within logical and ordered structures.

Further reading

Lewins and Silver (2007), pp. 91-115.

It is important to be aware of the technical aspects of hierarchies in your chosen software as they do not always function as expected, but it will usually be possible to manipulate coding schema structures to suit analytic requirements. For example, in packages which do not provide a visually hierarchical coding schema in their main listing (e.g. ATLAS.ti), it can be useful to prefix similar codes as a precursor to merging or creating sets, or simply to create some thematic or conceptual order in an alphabetically ordered list. This can also be a useful procedure in packages which do provide visually hierarchical coding schema.

Whether initially working inductively or deductively may have an impact on the way hierarchical coding schema structures are generated and function. For example, when working inductively it is common to first generate a relatively long list of fairly detailed, unorganised codes. Later on – when the long list starts to become practically unwieldy or when similarities and differences between individual codes become more apparent –  the list may usefully be reorganised to make collections of themes. Depending on the chosen software the top-level code may ‘contain’ all the data coded at the lower levels; i.e. the collection functions hierarchically, or it may simply function as a ‘pointer’ to the lower level coding; i.e. the top-level code remains ‘empty’.
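The difference between a top-level code that ‘contains’ the data coded beneath it and one that merely ‘points’ to its sub-codes can be made concrete with a small sketch. The Python below is illustrative only: the `HIERARCHY` and `CODED` structures, the `segments_at` function and the sample codes are hypothetical, and the `aggregate` switch stands in for behaviour that in practice is determined by the chosen software.

```python
# Hypothetical coding schema: parent code -> list of sub-codes.
HIERARCHY = {"barriers": ["cost", "distance"]}

# Hypothetical coding: code -> ids of the segments coded directly there.
CODED = {
    "barriers": set(),   # nothing coded directly at the top level
    "cost": {1, 4},
    "distance": {2, 4, 7},
}


def segments_at(code, aggregate):
    """Return the segment ids retrieved at `code`.

    aggregate=True  : the parent 'contains' everything coded at its
                      sub-codes (a functionally hierarchical collection).
    aggregate=False : the parent is only a 'pointer' and returns what
                      was coded to it directly, which may be nothing.
    """
    found = set(CODED.get(code, set()))
    if aggregate:
        for child in HIERARCHY.get(code, []):
            found |= segments_at(child, aggregate=True)
    return found


print(segments_at("barriers", aggregate=True))
print(segments_at("barriers", aggregate=False))
```

Knowing which of these two behaviours applies matters when reading frequency information, because a parent code may report either the combined coding of its sub-codes or nothing at all.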

In contrast, when working deductively the coding process may commence with a smaller number of broad themes to which data are initially coded in fairly general terms. Having collected all data about each theme it may then be necessary to re-code that data into more detailed sub-categories. This enables the analytic work of deconstructing general themes to occur at the point at which all the data generally about an issue has been collected together. New, more detailed codes are placed directly under the main theme as sub-codes, producing a functionally hierarchical collection of codes.

A definitive choice between the two procedures need not necessarily be made as many projects benefit from adopting a combined approach. As long as the researcher is aware of how particular coding schema structures function technically and analytically, there is no reason why each collection of codes needs to be utilised in the same way.

Short-cut groupings

Most CAQDAS packages provide the ability to create short-cut groupings of codes which augment the main coding schema, whether represented hierarchically or as a flat list. These groupings (usually called ‘sets’) can be created at any stage for a range of practical, analytical or theoretical reasons. They cut across the main coding schema structure to provide alternative ways of grouping codes to represent and explore patterns and relationships.

Practical reasons for creating code sets may include the following:

  • To act as reminders of codes which require further in-depth consideration
  • To gather together codes which were generated by particular team members
  • To gather together codes which contribute to the argument to be made in a particular thesis chapter or journal article etc.

Analytic or theoretical reasons for creating code sets may include the following:

  • To gather together codes which contribute to the answering of particular research questions
  • To cut across the structures of the main coding schema in order to consider alternative relationships
  • To gather together codes which represent examples of theoretical ideas or hypotheses
  • To gather together codes which are pertinent to particular individuals, cases or events etc.

The key benefit of sets is that because they are short-cut groupings, they need not affect the main coding schema in any way. Indeed, the main coding schema can continue to be viewed completely independently of sets. Rather, sets act as an additional layer of coding; as ‘sign-posts’ or ‘hunches’ for further investigation. The same code can belong to as many different sets as required because codes remain physically located in the main coding schema and the sets simply provide short-cut access to them. As such sets afford great flexibility in considering groups of codes in alternative ways, without affecting the main coding schema which will have been given considerable thought. In addition, sets can be created speculatively as they need not be used if they are subsequently revealed as being unhelpful. As such they can act as a means of testing ideas without altering or damaging the main coding schema structure. This can be incredibly valuable when conducting qualitative data analysis, regardless of approach.
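The reason sets leave the main coding schema untouched is that a set holds only references to codes, not the coded data itself. A minimal Python sketch of that design follows; the `CODING_SCHEMA` and `SETS` structures, the function names and the sample codes are hypothetical and are not drawn from any CAQDAS package.

```python
# Hypothetical main coding schema: code -> ids of coded segments.
CODING_SCHEMA = {
    "cost": {1, 4},
    "distance": {2, 4, 7},
    "trust": {3},
    "staff attitudes": {5, 7},
}

# Sets hold code names only, so one code can belong to several sets.
SETS = {
    "chapter 5 argument": {"cost", "trust"},
    "providers (hunch)": {"trust", "staff attitudes"},
}


def segments_in_set(set_name):
    """Short-cut access: all segments coded at any code in the set."""
    found = set()
    for code in SETS[set_name]:
        found |= CODING_SCHEMA[code]
    return found


def drop_set(set_name):
    """Discard a speculative set; the coding schema is not touched."""
    SETS.pop(set_name, None)


print(segments_in_set("providers (hunch)"))
drop_set("providers (hunch)")
print(sorted(CODING_SCHEMA))
```

Because `drop_set` removes only the grouping, a set created to test an idea can be abandoned without any change to the codes or the data coded at them.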

Set creation when working deductively

We have found the use of sets equally useful when working deductively or inductively. When working deductively it may be quite important that the main coding schema remains constant, for example if testing existing hypotheses on a new dataset, conducting a longitudinal project, or working in a complex team situation where several researchers are coding and there is a requirement to incrementally merge separate software projects. See our teamworking pages for more information on working in collaboration with others using CAQDAS packages. Using sets in such circumstances enables the researcher to step outside of the constraints of the main coding schema, fostering the likelihood of identifying the unexpected whilst maintaining the main focus as represented in the coding schema.

Set creation when working inductively

When working inductively it is common to generate a large number of codes during the first pass through the data. It is not unusual, for example, to have a list of several hundred codes when working completely inductively. Researchers often feel overwhelmed at this point, reporting a sense of ‘not being able to see the wood for the trees’. There are various strategies for achieving a greater sense of coherence over the coding schema in such circumstances. The creation of code sets can be particularly helpful, both during early stages when it feels ‘risky’ to delete or merge codes, and in later stages when the need to impose more order on an unwieldy coding schema becomes more pertinent.

In this section

  • Linking data segments
  • Linking codes

Creating links between data and codes

Coding is essentially a process of cataloguing or indexing data and is useful for grouping similar data segments in order to consider their meaning. However, gathering several similar data segments at a code is simply a process of stating that those segments are generally ‘about’ this topic. Coding alone therefore does not help the researcher to make statements about the relationship between data segments or themes. Many CAQDAS packages do, however, provide linking tools which enable the creation and exploration of more specific patterns and relationships in the data. Here we discuss linking data segments and linking codes as two important aspects of ‘moving beyond coding’.

Linking data segments

Many software packages enable the creation of specific linkages between data segments that may more powerfully represent connections. This is useful when there is an analytic need to create and track associative trails through the data without abstracting to the conceptual level.

Further reading: See Silver and Fielding (2008) and Silver and Patashnick (2011).

The ways these linkages are created and visualised differ significantly between packages, as does their functionality in terms of subsequent data retrieval. Some packages, for example, allow links between data segments to be visualised in a map (see the next section for more information on mapping tools) and for links to be user-defined and labelled, whereas others are less flexible. Retrieval options also differ: some packages utilise ‘functional’ links, such that data segments can be retrieved on the basis of the existence of certain types of links, whereas others allow links to be made between data segments but have only very limited means by which to retrieve data subsequently. Where there is an analytic need to remain faithful to the narrative of data, or to track non-thematic or non-linear processes, these data linking tools are particularly useful. For software specific information on such tools see the individual software reviews.

Linking codes

Linking codes is a means of stepping away from the level of the data and thinking more conceptually. There are many reasons for linking codes. For example, codes may be linked early in a project as a way of representing a priori theory which is being tested, toward the end of a project as a way of visualising findings, or as an analytical device throughout a project to help develop connections. Linking codes is often achieved most productively through the use of mapping tools.

In this section

  • Displaying data in maps
  • Mapping prominent features
  • Representing theories
  • Rethinking coding schema structures
  • Viewing and creating conceptual links

Stepping back to view data at a more conceptual level: the utility of mapping tools 

It is often illuminating to step away from the minutiae of the data and think at a more abstract or conceptual level. This allows data to be reconsidered and reconceptualised. Integrated modelling, mapping or networking tools are designed to facilitate these processes and can be useful for a range of purposes. Here we focus specifically on the graphic consideration of aspects of data as a key analytic task in facilitating the process of moving on beyond basic coding.

CAQDAS packages provide mapping tools in quite different ways and therefore if the development of conceptual maps is a key aspect of your methodology it will be important to pay particular attention to these differences when choosing software. See the individual software reviews for more detail on the differences between individual CAQDAS packages in this regard. 

Creating maps allows you to illustrate aspects of a project and to view and think about data and concepts in different ways. As well as thinking about data sequentially or thematically, maps enable data and concepts to be considered spatially or semantically and for relationships between analytic aspects to be established and explored. Positioning objects graphically can facilitate the identification of patterns or relationships which can be more difficult to see when working purely at the level of the data itself or at the conceptual or thematic (coding) level. Maps can vary from the quite simple to the complex; in either case they are important reflective tools helping to clarify ideas and refine interpretations. As such they can act as an important means of ‘moving on analytically’ regardless of analytic approach.

Here we list and discuss a number of ways in which the creation of maps within CAQDAS packages can help researchers ‘move on’ beyond the early stages of coding: 

  • To display coded data visually and reflect on its meaning
  • To map the prominent features of an individual respondent, case, or event in terms of how it has been coded thus far
  • To represent an a priori theoretical framework and to compare it with the current state of thinking as indicated by early codings
  • To rethink coding schema structures, perhaps as a precursor to making significant and lasting changes
  • To view the existence of patterns and relationships based on the coding so far achieved
  • To create links between codes which represent ‘relationships’ which might cut across coding schema structures – as a way of visually representing a developing interpretation when working inductively 

Displaying data in maps

Some packages (e.g. ATLAS.ti and Qualrus) allow data segments to be viewed within a map and to be coded, un-coded, re-coded or linked to other segments as appropriate from within the map view. This provides the ability to work in much the same way that one might if carrying out an analysis manually, using index card systems or the like to thematically sort and stack data segments, thereby generating higher order categories. The benefit of the computerised version is that these data segments need not be copied to be assigned to different codes, and that the source context from which they derive is always simply a click away.

Mapping prominent features

Where there is a need to develop a typology or to write a thick description of an individual’s experience, a case or an event, maps can provide a powerful means of representation. Focussing on prominent features of one aspect of a project in this way provides a visual means of comparison.

Representing theories

When working deductively analysis might be framed by existing theoretical constructs which can be summarised within maps. Representing these frameworks within the chosen CAQDAS package can act as a useful reminder to ground analysis as well as a means by which to visually compare existing theoretical ideas with the current dataset. If maps which represent theory are generated before the coding process commences, they can subsequently be compared with maps generated directly from data, thus facilitating an assessment of whether the theory holds up in different settings.

Rethinking coding schema structures

As a supplement to or substitute for the use of sets, maps can help sort existing coding schema structures. Visualising codes graphically and moving them around can facilitate the process of creating logical groupings or linkages to represent as coding schema structures.

Viewing and creating conceptual links

Maps are key ways in which simple or more complex relationships can be graphically outlined. Some packages (e.g. MAXqda and NVivo) reserve mapping tools for more conceptual tasks, enabling codes, documents and other objects, but not data segments themselves, to be visualised within them. That said, coded data can always be accessed from within a map, so even if data segments cannot be displayed within maps they can contribute directly to conceptual work.

In this section

  • Matrices and charts
  • Graphic visualisations

Alternative visual representations

Representing data in various ways can facilitate the processes of analysis and more recently several CAQDAS packages have begun providing alternative visual representations. Some of these have been discussed above in relation to code groupings and mapping tools, but others include more quantitative representations (such as tabular presentations and charting functions) and qualitative representations (such as colour-based visualisations). Here we briefly outline some of these.

Matrices and charts

The generation of qualitative cross-tabulations in the form of interactive matrices has been available in some CAQDAS packages for many years, although more recently it has become more widespread. Numeric or quantitative summaries of qualitative data, which are usually in the form of frequency information concerning various aspects of coding, should always be viewed with an element of care, particularly where sample sizes are small. However, they can provide useful means of thinking about data and coding differently as they usually emphasise patterns and relationships in ways which may not be so obvious otherwise. Although other software applications usually have more sophisticated means of generating and manipulating charts, the benefit of using in-built CAQDAS versions is the maintenance of the link with the qualitative data that underlies the chart or table. Being able to check the basis upon which numeric data is calculated is of utmost importance and CAQDAS packages are predicated on this principle.
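The principle that every number in a matrix should lead back to the data behind it can be illustrated with a small cross-tabulation of codes against a respondent attribute. This Python sketch is a simplified illustration under stated assumptions: `ATTRIBUTES`, `CODED_SEGMENTS` and `build_matrix` are hypothetical names, and the sample data are invented. Each cell stores the underlying passages, and the count is derived from them rather than stored separately.

```python
from collections import defaultdict

# Hypothetical respondent attribute: data file -> location.
ATTRIBUTES = {
    "interview_01": "urban",
    "interview_02": "rural",
    "interview_03": "rural",
}

# Hypothetical coded segments: (data file, code, text).
CODED_SEGMENTS = [
    ("interview_01", "access", "The bus only runs twice a day."),
    ("interview_02", "access", "It is a forty minute drive."),
    ("interview_03", "access", "There is no clinic in the village."),
    ("interview_01", "trust", "I have seen the same nurse for years."),
]


def build_matrix(coded_segments, attributes):
    """Group segment texts into cells keyed by (code, attribute value)."""
    cells = defaultdict(list)
    for source, code, text in coded_segments:
        cells[(code, attributes[source])].append(text)
    return cells


matrix = build_matrix(CODED_SEGMENTS, ATTRIBUTES)
for (code, group), texts in sorted(matrix.items()):
    # The frequency is just the number of passages held in the cell.
    print(f"{code:8} {group:6} n={len(texts)}")
```

With so few segments the counts themselves mean little, which is the caution raised above; the value lies in being able to open any cell and read the passages it summarises.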

Graphic visualisations

As well as matrices and charts, some CAQDAS packages provide alternative graphic visualisations which represent aspects of a project, usually based around how data have been coded. Some of these representations rely quite heavily on colour to distinguish between, for example, how frequently particular codes occur or sequential concentrations of coding.

Coding is at the heart of most CAQDAS packages and of many qualitative approaches which utilise them, and as such software coding tools are easy to use, flexible and powerful. However, coding is only one of the core tasks of conducting qualitative data analysis and there is a risk that analysis can be stunted if researchers cannot find ways to move beyond essentially descriptive coding. The material presented here is designed to highlight the various software tools available which are specifically designed to facilitate the move from basic coding tasks to more analytical processes. They are not exhaustive and many researchers may find alternative ways of proceeding, but we hope to have outlined some of the main tools which can help in the process of ‘moving beyond coding’, which was identified in our research into researchers’ use of CAQDAS packages as a common concern or ‘sticking point’.

For more information about the similarities and differences between particular software packages see the software options area of this website. For more detail on the aspects covered here see Lewins and Silver (2007). Please note that this resource does not provide step-by-step information about how to achieve these tasks in given software packages, nor is it fully exhaustive.

Attend one of our training courses or visit the Resources section for further links to other relevant publications and materials.

Combining and converting qualitative and quantitative data in CAQDAS packages: an aid to mixed method research

In this section we outline how different CAQDAS packages (e.g. ATLAS.ti, NVivo, MAXQDA, QDA Miner) enable the combining and converting of qualitative and quantitative data in order to support mixed methods research. This section is a starting point for this type of work; we do not provide step-by-step instructions for carrying out these tasks here, but give an overview of some common capabilities.

As conducting mixed-method research becomes increasingly popular, interest in ways to combine and convert qualitative and quantitative data with the support of CAQDAS packages is also on the rise. In addition to an increase in literature discussing the issue and software packages implementing new tools to support mixed-methods approaches, we see this trend reflected in the questions asked at our qualitative software seminars and training sessions and requests for information received by our email / telephone helpline. In response to this need we provide materials to support those who are new to the topic.

Further reading

For more information consult individual software manuals and tutorials or visit related pages on this website such as analysing open-ended responses to survey data using CAQDAS packages.

In this section

  • Linking quantitative information with qualitative data
  • Adding value with known characteristics
  • Open-ended responses to survey questions
  • Importing quantitative data into CAQDAS packages

Combining qualitative and quantitative data

Note: This section describes and discusses the value of importing quantitative information about qualitative data and points to some basic differences in the ways leading CAQDAS packages enable this.

The ability to work with several different types of data has been a feature of many CAQDAS packages since their inception. However, the range of types of information and data that can be directly handled has mushroomed in recent years, with most packages now handling many types of multi-media data (including, for example, still and moving images, web-based and Google Earth records) as well as any form of textual data (more recently including PDF files).

In addition, the ability to introduce quantitative information in the form of attributes or variables to handle the known characteristics or ‘facts’ about data or respondents has been a long-standing common feature of most mainstream CAQDAS packages. But several packages have recently added specific tools to aid mixed-methods approaches. These include specific routines for importing survey data as well as some more refined data interrogation tools.

a. Linking quantitative information with qualitative data

Even when working with a small-scale qualitative data set there will be some quantitative ‘facts’ or ‘characteristics’ about data that are known and which can be linked with qualitative data for analytic purposes.

i. Adding value with known characteristics

The socio-demographic characteristics of interview or focus-group respondents, the features of an observational fieldwork setting or the date and time at which a blog entry was posted are all characteristics which may be collected or known about qualitative data. Even where sample sizes are small and the aim is not to make general statements about the wider population, taking account of these characteristics and exploring data according to them is a valuable analytic task. At the other end of the continuum a large-scale survey including open-ended as well as closed questions comprises a large amount of quantitative information which can be usefully linked to the textual responses. Alternatively, mixed-methods projects which include data collected both qualitatively and quantitatively usually require some means of combining the two, whilst maintaining the ability to interrogate them separately. This may especially be the case when quantitative data is held about qualitative data. For example, a survey of 2000 respondents may have been conducted, followed by in-depth interviews with a small sample of those survey respondents in order to investigate particular themes more fully. In such a situation linking respondents’ socio-demographic characteristics and survey responses to their qualitative comments in the interview will be imperative in order to interrogate the relationship between measures, trends, experiences and attitudes.

Most CAQDAS packages enable this sort of information to be linked to qualitative data either manually within the software, or by importing or assigning information that has been externally collated and prepared in a spreadsheet application, such as Excel. Figure 1 below illustrates typical factual, in this case, socio-demographic, information prepared in a spreadsheet application ready for incorporation into a CAQDAS package in order to be linked to the corresponding qualitative data.

Figure 1: Socio-demographic information ready to be imported into a CAQDAS project


Where there is a longitudinal aspect to a project, with data being collected at different stages, the incorporation of known characteristics enables both within and cross-case analysis. For example, an individual respondent’s experiences could be tracked over time (within case analysis) and/or a particular theme could be explored amongst all (or a subset of) respondents at one time point (cross-case analysis).

Whatever the characteristics of the research project, when using CAQDAS packages to facilitate the analysis, the importation of known characteristics such as socio-demographic attributes will add value to the interrogative possibilities of an analysis, enabling, for example, the comments or experiences of groups of individuals to be explored according to age, gender, marital status or any other known characteristic. This enables either a focus on one particular subset of data, for example retrieving all the data coded to a particular theme, say, ‘happiness’, amongst only female respondents, or a comparison of a group of themes, say, attitudes to government policy, amongst different types of people, for example respondents of different age categories, or occupations.
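The two kinds of interrogation described above (focusing on one subset, and comparing a theme across groups) can be sketched as follows. This is a minimal illustration in Python, not the query tool of any CAQDAS package: the respondents, attribute names, codes and the retrieve function are all hypothetical.

```python
# Hypothetical known characteristics, keyed by respondent ID.
attributes = {
    "R01": {"gender": "female", "age_group": "18-34"},
    "R02": {"gender": "male", "age_group": "35-54"},
    "R03": {"gender": "female", "age_group": "55+"},
}

# Hypothetical coded segments: (respondent ID, code, text).
coded_segments = [
    ("R01", "happiness", "I felt really content after the move."),
    ("R02", "happiness", "Things are looking up for us."),
    ("R03", "happiness", "We laugh a lot more these days."),
    ("R01", "policy", "The new benefit rules confuse me."),
    ("R02", "policy", "I think the changes were overdue."),
    ("R03", "policy", "Nobody asked people my age."),
]


def retrieve(coded_segments, attributes, code, **criteria):
    """Return segments coded at `code` from respondents matching all criteria."""
    results = []
    for respondent, segment_code, text in coded_segments:
        if segment_code != code:
            continue
        profile = attributes.get(respondent, {})
        if all(profile.get(key) == value for key, value in criteria.items()):
            results.append((respondent, text))
    return results


# Focus on one subset: 'happiness' amongst female respondents only.
female_happiness = retrieve(coded_segments, attributes, "happiness", gender="female")

# Comparison: segments coded 'policy', grouped by age category.
policy_by_age = {}
for respondent, text in retrieve(coded_segments, attributes, "policy"):
    age_group = attributes.get(respondent, {}).get("age_group", "unknown")
    policy_by_age.setdefault(age_group, []).append(text)

print(female_happiness)
print(policy_by_age)
```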

Once known characteristics have been imported or assigned appropriately, query tools can be used to carry out a full range of interrogations. Most CAQDAS packages provide similar queries to make use of known characteristics, although terminology often differs and the ease with which queries can be created, or data otherwise interrogated, varies.

ii. Open-ended responses to survey questions

Open-ended responses to survey questions are a form of qualitative data that are often neglected in survey analysis. Yet CAQDAS packages provide opportunities to systematically analyse them and to integrate the results with quantitative survey findings. Indeed many CAQDAS packages now provide specific routines for importing survey data such that responses to open (qualitative) and closed (quantitative) questions can be linked and analysed. For detailed information and routines for working with responses to open-ended survey questions using selected CAQDAS packages see the analysing survey data pages.   

iii. Importing quantitative data into CAQDAS packages

Quantitative attributes or variables can be created manually within CAQDAS packages and linked with corresponding qualitative data. However, this step-by-step process is often time-consuming and laborious and quantitative information is often already held externally to the CAQDAS package in a statistical or spreadsheet application. Even when this is not the case, once the project size exceeds a dozen or so respondents, it will usually be easier to collate relevant facts externally and then import them, rather than to do so within the CAQDAS package.

Although the format in which attribute information needs to be prepared and the routines for importation vary, all CAQDAS packages offer means of importing quantitative information and linking it automatically to the corresponding qualitative data. In order to ensure successful importation, users are advised to follow the software-specific routines outlined in software developer manuals or tutorials. Routines for importing survey data into selected CAQDAS packages can be found on this website under analysing survey data.
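As a rough sketch of what automatic linking involves, the Python example below reads an attribute table saved as CSV and matches each row to a qualitative document by a shared identifier, reporting anything that fails to match. The column names, identifiers and variable names are hypothetical; actual packages each have their own required layout, so this does not stand in for any software-specific routine.

```python
import csv
import io

# Hypothetical attribute table as it might be saved from a spreadsheet (CSV).
attribute_csv = """respondent_id,gender,age,marital_status
R01,female,34,single
R02,male,51,married
R04,female,27,married
"""

# Hypothetical names of the qualitative documents already in the project.
documents = ["R01", "R02", "R03"]

# Read the table into a dictionary keyed by the shared identifier.
rows = {
    row["respondent_id"]: row
    for row in csv.DictReader(io.StringIO(attribute_csv))
}

# Link each document to its attribute row where the identifiers match.
linked = {doc: rows[doc] for doc in documents if doc in rows}

# Report mismatches so they can be corrected before analysis.
documents_without_attributes = [doc for doc in documents if doc not in rows]
rows_without_documents = [rid for rid in rows if rid not in documents]

print("Linked:", sorted(linked))
print("Documents without attributes:", documents_without_attributes)
print("Attribute rows without documents:", rows_without_documents)
```

A check of this kind is worth doing whatever the tool, since a mistyped identifier in the spreadsheet is a common reason for attributes failing to attach to the intended data.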

In this section

  • Within the software
  • Basic summary information
  • Code frequency information
  • Code-co-occurrence information
  • Charting and other quantitative visual representations
  • Outputting quantitative information
  • Summary frequencies and reports
  • Tabular export options

Converting qualitative data to quantitative representations

Note: This section provides an overview of ways CAQDAS packages enable qualitative data to be converted into quantitative representations, distinguishing between tools provided within software and options for outputting quantitative information for further analysis.

The use of software to support the analysis of qualitative data has opened up qualitative research to the possibility of working with much larger datasets than is logistically possible otherwise. The rise of mixed methods approaches also precipitates more interest in quantitising qualitative data.

CAQDAS packages have always provided some basic frequency information about how data are coded (i.e. how often a given code was assigned to the data), but some recent developments have placed more emphasis on alternative quantitative means of interrogating and representing qualitative data.

a. Within the software 

i. Basic summary information

CAQDAS packages hold a variety of basic summary information about data and analyses in numeric format. This ranges from rudimentary information concerning the number of data files, memos, codes etc. in the project to more complex coding frequency information. The former is useful in describing features of the analysis when writing up, the latter in undertaking the analysis. Code frequency information includes basic information concerning the number of data segments coded at a particular code, either across the dataset as a whole or amongst a subset of data; for example in data of a certain type or among respondents with particular characteristics. Such frequency information informs the development of themes and concepts. For example, codes with high frequencies may be indicative of salient themes whilst those with low frequencies may be redundant. Although frequencies alone are insufficient for identifying salience, clusters and gaps in coding provide a useful starting point for taking the analysis beyond the early stages of data indexing or descriptive coding. For more on this topic see 'Moving beyond initial coding' above.
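The kind of code frequency information described here can be illustrated with a short Python sketch that counts coded segments per code, first across a whole dataset and then within a subset of respondents. The segments, the codebook and the variable names are hypothetical, and the counts carry no analytic weight beyond illustrating the calculation.

```python
from collections import Counter

# Hypothetical coded segments: (respondent, gender, code).
coded_segments = [
    ("R01", "female", "happiness"),
    ("R01", "female", "work"),
    ("R02", "male", "happiness"),
    ("R02", "male", "family"),
    ("R03", "female", "happiness"),
    ("R03", "female", "work"),
]

# Hypothetical codebook, including a code that has not been applied yet.
codebook = ["happiness", "work", "family", "health"]

# Frequencies across the dataset as a whole.
overall = Counter(code for _, _, code in coded_segments)

# Frequencies amongst a subset: female respondents only.
female_only = Counter(
    code for _, gender, code in coded_segments if gender == "female"
)

for code in codebook:
    print(f"{code:<10} all={overall[code]} female={female_only[code]}")

# Gaps in coding: codes in the codebook that were never applied.
unused = [code for code in codebook if overall[code] == 0]
print("Unused codes:", unused)
```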

ii. Code co-occurrence information

Code co-occurrence refers to instances where two or more codes overlap in the data. CAQDAS packages provide a range of ways of searching for code co-occurrences, usually by interrogating the dataset through a query tool. However, query options differ in some significant ways between packages, so it is important to become familiar with the terminology and functionality of the chosen package before relying on the results of any query.

The desired result of co-occurrence searches is a collection of the qualitative data segments to which the searched for codes apply, but most CAQDAS packages also provide numeric frequency information concerning the prevalence of co-occurrences. Some provide this in easily generated co-occurrence tables which do not require the use of complex query tools. Code co-occurrences can usually be generated across the whole dataset or as a means of making comparisons between subsets of the data or respondents.
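One simple way to think about co-occurrence is as overlap between the character ranges of two differently coded segments in the same document. The Python sketch below adopts that definition as an assumption; packages define co-occurrence in various ways (overlap, enclosure, proximity and so on), so this is not a description of how any particular query tool works. The segments and names used are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded segments: (document, code, start, end),
# using character offsets with the end position exclusive.
coded_segments = [
    ("R01", "happiness", 0, 120),
    ("R01", "family", 80, 200),
    ("R01", "work", 300, 400),
    ("R02", "happiness", 10, 50),
    ("R02", "work", 40, 90),
    ("R02", "family", 500, 560),
]


def overlaps(a, b):
    """True if two segments are in the same document and share some text."""
    return a[0] == b[0] and a[2] < b[3] and b[2] < a[3]


co_occurrence = Counter()  # numeric summary: how often each code pair overlaps
hits = []                  # qualitative result: where the overlaps are

for a, b in combinations(coded_segments, 2):
    if a[1] != b[1] and overlaps(a, b):
        pair = tuple(sorted((a[1], b[1])))
        co_occurrence[pair] += 1
        # Record the document and the overlapping range for retrieval.
        hits.append((a[0], pair, max(a[2], b[2]), min(a[3], b[3])))

for pair, count in sorted(co_occurrence.items()):
    print(pair, count)
for hit in hits:
    print(hit)
```

Keeping the list of overlapping ranges alongside the counts mirrors the point made above: the numeric table is a summary, while the retrieved segments remain the primary result.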

iii. Quantitative visual representations

Many CAQDAS packages are responding to the interest in mixed methods approaches and the increasingly digitised and visual world by providing an increasing number of alternative ways of visualising and representing data. Many of these are essentially quantitative visual representations of qualitative data. Examples range from basic charts and matrices to more complex representations such as heatmaps and dendrograms.

In this section

  • Outputting quantitative information
  • Reports and outputs

Outputting quantitative information

Note: In this section we provide brief information on outputting reports and tables.

As well as viewing quantitative information within CAQDAS packages, such data can be exported in various ways. This may be done simply as a means of presenting findings or in order to conduct further statistical analysis as part of a mixed methods project.

Further reading: See Bazeley (1999) and Fielding et al. (2013) for more information on the latter.

a. Reports and outputs

CAQDAS packages provide report tools which are usually the basis of exporting information. As well as a full range of qualitative reports (e.g. data excerpts which have been coded in particular ways), quantitative summary reports can also usually be generated and exported. These often replicate aspects of the basic frequency information discussed above. Report tools vary in the information which can be generated and the format in which it can be saved, but most enable numeric information to be saved in such a way that it can easily be viewed in a spreadsheet or statistical application. In addition, typically any table or list generated within a CAQDAS package can be easily exported. Provided the dataset is large enough, this can then be subjected to further statistical analysis.
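To give a sense of what such an exported table might contain, the Python sketch below writes a respondent-by-code frequency table in CSV format, which spreadsheet and statistical applications can generally open. The data, the column layout, the function name write_table and the file name in the final comment are hypothetical, and real packages offer their own export formats.

```python
import csv
import io
from collections import Counter

# Hypothetical coded segments: (respondent, code).
coded_segments = [
    ("R01", "happiness"),
    ("R01", "happiness"),
    ("R01", "work"),
    ("R02", "family"),
    ("R02", "happiness"),
]

respondents = sorted({respondent for respondent, _ in coded_segments})
codes = sorted({code for _, code in coded_segments})
counts = Counter(coded_segments)


def write_table(handle):
    """Write one row per respondent and one column per code."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["respondent"] + codes)
    for respondent in respondents:
        writer.writerow([respondent] + [counts[(respondent, code)] for code in codes])


# Written to an in-memory buffer here so that the sketch is self-contained.
buffer = io.StringIO()
write_table(buffer)
table_csv = buffer.getvalue()
print(table_csv)

# To save a file instead (hypothetical file name):
# with open("code_frequencies.csv", "w", newline="") as handle:
#     write_table(handle)
```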


Bazeley, P. (2018) Integrating analyses in mixed methods research. London: Sage

Bazeley, P. (1999). 'The Bricoleur with a Computer: Piecing together Qualitative and Quantitative Data', Qualitative Health Research, 9(2) pp.279-287

Fielding, J., Fielding, N. & Hughes, G. (2013) ‘Opening up Open-ended Survey Data Using Qualitative Software’, Quality & Quantity, 47(6) pp.3261-3276

Lewins, A., & Silver, C. (2007). Using Qualitative Software: A Step by Step Guide, London: Sage Publications

Silver, C., & Fielding, N. (2008). 'Using Computer Packages in Qualitative Research', in Willig C & Stainton-Rogers W (eds.) The Sage Handbook of Qualitative Research in Psychology, London, Sage Publications

Silver C (2010) Research Design for Longitudinal Case-study Project: Tracking the use of software in real projects using different methodologies, QUIC Briefing Paper, Qualitative Innovations in CAQDAS (QUIC), The CAQDAS Networking Project

Silver C & Patashnick J (2011) ‘Finding Fidelity: Advancing Audiovisual Analysis using Software’, FQS 12(1), Thematic Issue: Is Qualitative Software Really Comparable?


We always welcome feedback concerning the relevance and usefulness of our resources. Indeed, many of the materials we provide are the direct result of repeated requests from students and qualitative researchers. So if you have downloaded any of the materials above please email us with your comments.