Create and organise
Documenting and organising your data can be an overwhelming process, especially if more than one researcher is involved in the project. Here are some useful tips.
Data format is a crucial aspect of the research cycle, as digital data can become obsolete if not preserved in a standard format. Some proprietary formats (e.g. SPSS and Microsoft Excel) are widely used and likely to remain accessible for a reasonably long, but still limited, time.
The UK Data Archive provides a table of recommended file formats.
The key to organising files well is to structure your folders logically, so that files are easy to locate and access. You can view examples of file structures on the UK Data Service website.
A file’s name is its principal identifier. Selecting a meaningful name can save time and help with the long-term preservation of data. You should choose a file naming convention early on, so that naming is consistent throughout your research and understandable to all parties involved in the project.
In addition, the UK Data Service recommends the following as factors of good practice:
- create meaningful but brief names
- use file names to classify types of files
- avoid using spaces, dots and special characters (& or ? or !)
- use hyphens (-) or underscores (_) to separate elements in a file name
- avoid very long file names
- reserve the 3-letter file extension for application-specific codes of file format (e.g. .por, .csv, .odf, .tiff)
- include versioning within file names where appropriate
Examples of useful file names
- FG1_CONS_2010-02-12.rtf : interview transcript of the first focus group with consumers, which took place on 12 February 2010
- Int024_AP_2008-06-05.doc : interview with participant 024, interviewed by Anne Parsons on 5 June 2008
- BDHSurveyProcedures_00_04.pdf : version 4 of the survey procedures for the British Dental Health Survey
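The naming recommendations above can also be applied and checked programmatically. The following is a minimal Python sketch, not part of the UK Data Service guidance: the function names `build_file_name` and `is_safe_file_name` and the exact character rule are assumptions made for illustration.

```python
import re
from datetime import date

# Hypothetical rule based on the recommendations above: letters, digits,
# hyphens and underscores only, then a single dot and an extension.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def build_file_name(elements, extension, when=None):
    """Join name elements with underscores and append an ISO date if given."""
    parts = [str(element) for element in elements]
    if when is not None:
        parts.append(when.isoformat())  # YYYY-MM-DD, e.g. 2010-02-12
    return "_".join(parts) + "." + extension.lstrip(".")


def is_safe_file_name(name):
    """Return True if the name has no spaces, extra dots or special characters."""
    return SAFE_NAME.fullmatch(name) is not None


name = build_file_name(["FG1", "CONS"], "rtf", date(2010, 2, 12))
print(name)                                             # FG1_CONS_2010-02-12.rtf
print(is_safe_file_name(name))                          # True
print(is_safe_file_name("focus group 1 & notes.v2.rtf"))  # False
```

Writing dates as YYYY-MM-DD, as in the examples above, means that file names sort chronologically when listed alphabetically.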
It is important to ensure that you can distinguish between different versions of your folders/documents, especially if they are saved in different locations and controlled by multiple users.
Things to think about:
- How many versions of the same folder/document do you want to keep?
- Do you want to save versions that have major or minor amendments?
- Are you going to save the master (i.e. final copy) and milestone (i.e. working copy) versions in the same location? If yes, how are you going to separate the two?
- Do all users have access to the folder’s/document’s location?
- Create a version control table to keep track of your folder’s/document’s changes
- Use whole numbers for major changes, for example v01, v02, v03
- Use decimal numbers for minor changes, for example v01_01, v01_02, v01_03
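As an illustration of the numbering scheme above, the short Python sketch below works out the next version label after a major or minor change. The helper names (`parse_version`, `format_version`, `bump`) are hypothetical, and the label format is assumed from the examples in the list (v01 for a major version, v01_02 for the second minor revision of version 1).

```python
import re

# Assumed label format, following the examples above: v01 for a major
# version, v01_02 for minor revision 2 of major version 1.
VERSION = re.compile(r"v(\d{2})(?:_(\d{2}))?")


def parse_version(label):
    """Split a label such as 'v01_02' into (major, minor)."""
    match = VERSION.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognised version label: {label!r}")
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) else 0
    return major, minor


def format_version(major, minor=0):
    """Format (major, minor) as a label; a minor number of 0 is omitted."""
    if minor == 0:
        return f"v{major:02d}"
    return f"v{major:02d}_{minor:02d}"


def bump(label, major_change):
    """Return the next label after a major or a minor change."""
    major, minor = parse_version(label)
    if major_change:
        return format_version(major + 1)  # minor numbering starts again
    return format_version(major, minor + 1)


print(bump("v01", major_change=False))    # v01_01
print(bump("v01_03", major_change=True))  # v02
```

Each new label, together with the date, the author and a short description of the change, could then be entered as one row of the version control table.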
Why is documenting data important?
- It explains how data were created, collected and digitised
- It ensures that data are understood during the life of the research, but also when they have to be reinterpreted by other researchers.
What is metadata?
Metadata is data about your data that helps others understand it. Metadata answers the questions of why, what, when, where, how and by whom the data were collected, so that the data are not just understandable but also effectively reusable by other researchers.
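One simple way to capture these answers is a small structured record kept alongside the data file. The Python sketch below is only an illustration: the field names are hypothetical and simply mirror the six questions above, and the values are placeholders. A real project should follow the metadata schema required by its repository or funder.

```python
import json

# Hypothetical field names that mirror the six questions above.
REQUIRED_FIELDS = ("why", "what", "when", "where", "how", "by_whom")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values only; replace them with the details of your own data.
record = {
    "why": "placeholder: purpose of the study",
    "what": "placeholder: focus group transcript",
    "when": "2010-02-12",
    "where": "placeholder: location of data collection",
    "how": "placeholder: audio recording, transcribed",
    "by_whom": "placeholder: name of researcher",
}

print(json.dumps(record, indent=2))
print(missing_fields(record))  # an empty list means every question is answered
```

Checking for missing fields before a dataset is deposited is a quick way to spot gaps in the documentation.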
Not all research data are created in digital form (e.g. interview tapes, sensor readings, laboratory notebooks), so measures should be taken to ensure that:
- they are digitised when possible (e.g. by scanning or taking digital photos) and stored in an institutional shared drive that is accessible to all parties involved.
- they are stored properly and safely when digitisation is not possible. More information is provided in the Preserve & store section.
Personal Data is information that ‘relates to an identifiable living individual, as well as information which, when combined with other data accessible to the researchers, would permit the individual’s identification’. (JISC)
Sensitive Personal Data relates to the subject’s:
- ‘Racial or ethnic origin
- Political opinions
- Religious beliefs or other beliefs of a similar nature
- Membership of a trade union
- Physical or mental health or condition
- Sexual life
- Commission or alleged commission of any offence
- Involvement in criminal proceedings for any offence or alleged offence committed by them, including outcomes such as judgement and sentencing.’ (JISC)
Note that specific conditions apply when processing sensitive personal data.
There are legal and ethical obligations when you conduct research with human participants. For more information, please visit City’s webpage on How to apply for ethical approval.
Personal and sensitive personal data fall within the remit of the Data Protection Act 1998. Anonymised data fall outside this remit, because participants cannot be identified from them. However, many methods produce pseudonymised rather than truly anonymous data, so careful consideration should be given to the methods used to anonymise data.
A key principle of data protection is ‘data minimisation’. If data are not collected, the risk of their future misuse is automatically reduced. You should consider, for example, whether satisfactory research outcomes can be achieved without collecting personal data, or with only minimal collection of personal data. This might mean that your research does not require a subject’s date of birth, just their age range, or just the first part of their postcode rather than their full address (Data Protection and Research Data, JISC).
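The two examples above, an age range instead of a date of birth and the first part of a postcode instead of a full address, can be sketched in Python as follows. The function names `age_range` and `outward_code` and the ten-year band width are assumptions chosen for illustration, and the sample values are made up. Ideally, only the reduced values would be collected in the first place; the sketch shows the reduction applied to values that already exist.

```python
from datetime import date


def age_range(date_of_birth, on_date, width=10):
    """Return a coarse age band such as '20-29' instead of a date of birth."""
    age = on_date.year - date_of_birth.year
    # Subtract a year if the birthday has not yet occurred on on_date.
    if (on_date.month, on_date.day) < (date_of_birth.month, date_of_birth.day):
        age -= 1
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def outward_code(postcode):
    """Keep only the first part of a UK postcode (the part before the space)."""
    cleaned = postcode.strip().upper()
    if " " in cleaned:
        return cleaned.split()[0]
    # Without a space, assume the last three characters are the second part.
    return cleaned[:-3]


# Made-up example values, not real participant data.
print(age_range(date(1985, 7, 20), on_date=date(2010, 2, 12)))  # 20-29
print(outward_code("AB12 3CD"))                                 # AB12
```

Reducing detail in this way lowers the risk of identification but does not by itself guarantee anonymity, since combinations of coarse values can still single out an individual in a small sample.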
More information on how you can effectively anonymise data can be found on the JISC website.
Accessing/sharing personal data
In some instances, restricting access to data is mandatory. However, this does not mean that the data cannot be shared. The most common way of sharing sensitive personal data is to put in place strict procedures that users must follow before they can access such data. Please contact your School’s Contract Manager for advice.
More information on making data accessible can be found in the Publish & share section.