Handbook on Data Collection / Phase Six: Collect Data

From Akvopedia

In phase four of the Handbook, we address the design of the survey content and sample. Phase five addresses the preparation necessary for data collection in the field, including logistic considerations, role allocation, and training of data collectors. In this phase, we focus on the process and considerations for data collection in the field. As phase five makes clear, it is essential that the process of data collection goes smoothly and the questions and purpose are well understood by the data collectors. The quality of the data is established from the moment the data collectors introduce themselves to the interviewees. This article will go through the five steps of data collection which are summarised below.

Collecting data in five steps

Step one: Conduct final checks before entering the field
Step two: Collect data
Step three: Submit, process and maintain data
Step four: Check data quality as it is collected in the field
Step five: Finalise data collection

Step one: Conduct final checks before entering the field

If you’ve followed the steps covered in phase five of the Handbook, you should be close to sending your data collectors into the field. Here we will touch on a few final considerations to check off beforehand.


Crosscheck the communication system put in place in phase five.

  • How are field issues communicated within the project team, from data collector to supervisor to data analyst?
  • Is it clear which surveys are assigned to which data collectors?
  • Is the communication flow between data collectors, team leaders and supervisors clear?


Ensure all information is understood and accessible to everyone involved in data collection.

  • Consider creating an FAQ document for the data collectors if you haven’t already. Having a simple two-pager summarising important and relevant information for the survey and data collection process, and relevant project team contact details, can be extremely useful. This two-pager can also include survey protocols, definition lists, sampling guides, a day-to-day timetable of activities, and a checklist for equipment.
  • Review the scheduled list of activities.
  • Make sure that all data collectors understand their role and responsibilities, as well as the risks related to field work, and agree to them.
  • Consistency in the data collected and in sampling is essential for good quality data. In the field, data collectors need to know what to do if there is an issue with sampling (for example, if the household head is not at home or a water point is inaccessible) and who to notify. As mentioned in phase five, data collectors need a common understanding of the questions in their survey so that data collection is consistent across the team.


Have transportation, agreements on daily service allowance, and accommodation been finalised? Data collectors and supervisors should have:

  • Clear ID, such as a document or badge, showing the organisation and its mission, preferably with local government endorsement.
  • The survey available on their smartphone, or a generous number of surveys printed and stored in an envelope.
  • A letter from the relevant authorities permitting them to administer the survey.
  • The FAQ two-pager described above.
  • Appropriate mobile phone credits and/or data packages.
  • A list of phone numbers to call in case of an emergency or when support is needed.
  • Information to give, upon request, to the interviewee or any other party concerned by the data collection. Ensure informed consent has been obtained.
  • A clear understanding of the survey and sample design and the purpose of both. Please see phase five for further details.

Step two: Collect data

Many factors determine the appropriate methodology for data collection. How, where, when and by whom the data is collected will depend on the unique context of the project. A range of geographic, meteorological, cultural, political and linguistic factors may influence how the data is collected, and these, alongside the questions above, should guide any data collection methodology.


Ensure that global positioning system (GPS) points are taken when mapping out the site for the first time. Try to get as close to the water point as possible. When taking a photo of the water point, try to capture it in such a way that another person would be able to find it without your help in the future. For example, include the pump in context or with a familiar landmark in the background. Ask the following questions:

  • How is data on the water point currently captured and recorded? Who should you speak to about the water point?
  • What is the most appropriate method for capturing information about water inputs and outputs? Is it through observation and testing the water itself, or through talking with local users about their experience of extracting and using the water?
  • Is there a community or other group responsible for the upkeep of the water point?


General rules for data collectors conducting interviews:

  • Wear culturally appropriate clothing. For example, in some places it may be inappropriate for women to show their legs or wear trousers.
  • Be respectful of an interviewee’s time and answers, as well as possible restrictions on their answers.
  • Prompt without leading the responses. Try rephrasing questions to get the information needed.
  • Record the answers that the interviewee gives, not the answers that the data collector feels they should have given.

Interviewing the right person

Who will you need to interview to gather all the required data? For instance, suppose your survey has a question about how long it takes to collect water for the household. You are interviewing the household head, but they are not responsible for collecting the water. Instead, you’ll need to find and ask the person who does collect the water, if it is culturally appropriate to do so.

Restrictions on the interviewer

When collecting data, there may be a number of cultural limitations on when and whom the data collector is able to interview. Gender is one potential restriction. For instance, it may be inappropriate for a male data collector to interview a female interviewee and vice versa. A practical solution is to have a mix of genders in the data collection team.

Securing the interview location

Be conscious of the location and length of the interview. For instance, if you are recruiting in a marketplace but need to ask some personal questions, it might be wise to find a secluded location in which to conduct the survey. This will again be determined by cultural norms.


Choose a time for your survey that is most convenient for the respondent. Be sensitive to the cultural and practical elements that may affect this. For instance, there may be times when the interviewees are attending religious ceremonies, and farmers are unlikely to have time to complete a long survey during harvest. If a participant is unavailable when the data collector arrives, find out when they will be available. On which times or days might your interviewees be unavailable? What other factors could affect their availability or participation in the data collection?

Step three: Submit, process and maintain data

It is important to set definitive timelines for the collection, uploading, and checking of data. A common setup is for each data collector in a team to hand in their completed surveys to the team leader, who checks that they are complete at the end of the day before approving them. This also gives the team leader an opportunity to check the sampling progress and discuss any issues with the data collector during collection.
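The end-of-day completeness check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the question names in REQUIRED and the sample surveys are hypothetical, and a real check would use the variables of your own survey.

```python
def missing_answers(survey, required):
    """Return the required questions that were left blank in one survey."""
    gaps = []
    for question in required:
        value = survey.get(question)
        if value is None or str(value).strip() == "":
            gaps.append(question)
    return gaps


# Hypothetical question names; replace with the variables of your own survey.
REQUIRED = ["respondent_id", "village", "water_source"]

# Hypothetical surveys handed in by one data collector at the end of the day.
day_surveys = [
    {"respondent_id": "001", "village": "Village 1", "water_source": "borehole"},
    {"respondent_id": "002", "village": "", "water_source": "well"},
]

# Map each incomplete survey to the questions that still need an answer.
incomplete = {}
for survey in day_surveys:
    gaps = missing_answers(survey, REQUIRED)
    if gaps:
        incomplete[survey["respondent_id"]] = gaps

print(incomplete)  # {'002': ['village']}
```

A team leader could run a check like this before approving the day's surveys, while the data collector is still available to explain or correct the gaps.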

Database entry procedures

For the entry of data from paper-based surveys, those entering the data should have a clear format to follow. Provide a spreadsheet with the variables (questions) as column headings and each survey respondent as a separate row. Many data errors can occur at this stage, so it’s essential to have a way of linking a specific paper-based survey respondent to the spreadsheet entry. This can take the form of an identifier (e.g. one built from gender, country and age could look like F-NL-30).

In the case of mobile data collection, the time limit set for data collectors to upload data will depend on a number of factors:
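As a minimal sketch of this layout, the Python snippet below writes a data entry sheet with one column per variable and one row per respondent. The column names, the sample answers, and the sequence number appended to the identifier are assumptions made for illustration. The sequence number is added because gender, country and age alone would not distinguish two respondents with the same profile.

```python
import csv
import io


def make_respondent_id(gender, country, age, sequence):
    """Build an identifier such as F-NL-30-001 (the sequence suffix is an assumption)."""
    return f"{gender}-{country}-{age}-{sequence:03d}"


def write_entry_sheet(rows, fieldnames, handle):
    """Write one header row of variables, then one row per respondent."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Hypothetical variables (survey questions) used as column headings.
FIELDNAMES = ["respondent_id", "gender", "country", "age", "minutes_to_collect_water"]

rows = [
    {
        "respondent_id": make_respondent_id("F", "NL", 30, 1),
        "gender": "F",
        "country": "NL",
        "age": 30,
        "minutes_to_collect_water": 25,
    },
]

# An in-memory buffer stands in for the spreadsheet file in this sketch.
buffer = io.StringIO()
write_entry_sheet(rows, FIELDNAMES, buffer)
print(buffer.getvalue())
```

Writing the same identifier on the paper form makes it possible to trace any spreadsheet row back to its source.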

  • Is this a monitoring survey? If yes, they may need to sync regularly throughout the day to ensure data is up to date.
  • Will the data collectors have access to the internet via mobile data packs, or only WiFi? If using mobile data, uploading will be more or less continuous. If not, uploading periodically at hotspots will be necessary.
  • When should they sync and upload the data? Make sure the timing of uploads is clear and consistent. This will make it easier to coordinate data quality checks.

Data management protocol

Be mindful of data privacy and security considerations when collecting data. Consider the following questions:

  • How will you secure the completed surveys?
  • Will the data collectors carry the surveys with them, or hand them in to the supervisor?
  • How does data get from the field to the processing location?

How will the original data be stored (i.e. paper storage, digitally or both)?

  • For paper-based surveys, consider how to securely store the paper so it won’t be affected by environmental factors, such as bugs or damp.
  • It may be preferable to scan the paper version to store digitally.
  • When storing data digitally, consider whether to use cloud-based systems, copies on computers, or external hard drives. Security and encryption are essential.
  • If multiple agencies are involved in the data collection project, decide who will host the original data, and which other organisations need copies.

Backing up data

  • Original data should still be available in its unedited form as a backup, in case someone needs to check the data and code questions differently.
  • Backups secure the data against any unknown accidents or errors that may occur.
  • Back up at least weekly during the initial data entry and cleaning phase, then reduce the frequency later on.
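One way to keep an unedited copy is sketched below in Python: the raw file is copied to a backup folder under a timestamped name, and a checksum confirms that the copy matches the original. The file name, folder name and naming pattern are hypothetical, and this sketch does not cover encryption or off-site storage.

```python
import hashlib
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, backup_dir):
    """Copy the unedited file into backup_dir under a timestamped name."""
    original = Path(original)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{original.stem}_{stamp}{original.suffix}"
    shutil.copy2(original, target)
    # Verify that the copy is byte-for-byte identical to the original.
    if sha256_of(original) != sha256_of(target):
        raise IOError(f"Backup of {original} does not match the original")
    return target


# Demonstration in a temporary folder with a hypothetical raw data file.
with tempfile.TemporaryDirectory() as workdir:
    raw = Path(workdir) / "raw_survey_data.csv"
    raw.write_text("respondent_id,answer\nF-NL-30,yes\n")
    copy = back_up(raw, Path(workdir) / "backups")
    backup_matches = sha256_of(raw) == sha256_of(copy)
    backup_name = copy.name

print(backup_matches)
```

Because each backup gets its own timestamped name, earlier copies are never overwritten.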

Track changes

  • Recording any errors found and changes made to the data keeps the process transparent. This record is commonly referred to as a log book. It can also include field reports on difficulties encountered during data collection, whether they concern the survey, data entry, or field experiences.
  • During cleaning it’s also vital to ensure there is a consistent system for the naming and controlling of different versions.
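A log book can be as simple as a table with one row per change. The Python sketch below shows one possible structure. The field names and the example entry are assumptions for illustration, not a prescribed format.

```python
import csv
import io

# Hypothetical log book columns; adapt them to your own project.
LOG_FIELDS = [
    "date",
    "respondent_id",
    "variable",
    "old_value",
    "new_value",
    "reason",
    "changed_by",
]


def log_change(handle, **entry):
    """Append one change to the log; every field in LOG_FIELDS is required."""
    missing = [name for name in LOG_FIELDS if name not in entry]
    if missing:
        raise ValueError(f"Missing log fields: {missing}")
    csv.DictWriter(handle, fieldnames=LOG_FIELDS).writerow(entry)


# An in-memory buffer stands in for the log book file in this sketch.
log = io.StringIO()
csv.DictWriter(log, fieldnames=LOG_FIELDS).writeheader()

# Hypothetical entry: a typing error corrected against the paper form.
log_change(
    log,
    date="2020-01-15",
    respondent_id="F-NL-30",
    variable="minutes_to_collect_water",
    old_value="250",
    new_value="25",
    reason="Typing error confirmed against paper form",
    changed_by="data entry clerk",
)

print(log.getvalue())
```

Requiring a reason and a name for every change means the edited dataset can always be explained against the original.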

Step four: Check data quality as it is collected in the field

Ensuring the quality of your data is an essential part of the data collection process. It includes checking that everything is running smoothly logistically, tracking the collected data, and checking field work activities. Data quality checks should take place at the same time as your data collection in the field.

When should a field supervisor intervene?

The field supervisor should ideally check the surveys at the end of each day and initiate course correction immediately. Once a team leaves a survey location, it may be difficult to resurvey. Supervisors should review the sample and the survey answers where possible to ensure that all answers have been filled in. It is crucial to check in regularly with the data collectors themselves to ensure that they are meeting quotas. If issues occur with either the survey answers or a data collector’s performance, the supervisor should look into why this might have occurred and provide guidance accordingly. The list below sets out important things to consider to ensure the quality of the data.

Regular data checks: Data and dashboard managers need to check incoming data at regular, defined intervals (daily, weekly, etc.).

Data quality assessment skills: A skilled data quality auditor is required to isolate inconsistencies and ensure that the sampling breakdown is correct.

Data tracking system: A data tracking system is used to check what comes in from the data collectors. For digital data collection, the dashboard manager can easily track the last time a data collector was connected to the dashboard. By setting up these systems during data collection, you will be able to track sample size and distribution and infer data quality during the data collection process.

Data triangulation and quality checks: Data triangulation is a powerful technique that uses a combination of questions and question types to ensure data quality and validity. For example, a series of questions on region, district, and village coupled with a GPS location question can ensure validity and minimise the risk of data entry error.

Consider using a tool such as Akvo Lumen, Carto, or another mapping tool to check that the points are in the correct locations.

You can also use Microsoft Excel to filter and sort answers to find outliers and inconsistencies in the data. Checking data quality during data collection is a skill set of its own. To find out more, see the FAO article Data Quality Assurance. This will also be discussed in more depth in phase seven of the Handbook.

Feedback and cross-checking process: Set up a process for giving feedback and for cross-checking the collected data with the data collector on any errors, so that they can notify the data managers ahead of time.

Important: avoid changing the survey during data collection. It can be tempting to make small adjustments, especially with mobile data collection, but this will endanger consistency.

Documentation: For reporting purposes, keep track of errors found with data entry, difficulties encountered during field implementation, and the steps taken to resolve them.

Additionally, it is best practice to maintain a codebook to keep track of the editing and/or coding of data. This is also touched upon in phase seven of the Handbook.
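The triangulation example above (stated region, district and village checked against a GPS point) can be sketched in Python as follows. The district names, bounding boxes and submissions are entirely hypothetical. A real check would use proper administrative boundaries, which a mapping tool handles more accurately than simple rectangles.

```python
# Hypothetical bounding boxes per district: (min_lat, max_lat, min_lon, max_lon).
DISTRICT_BOUNDS = {
    "District A": (1.0, 2.0, 30.0, 31.0),
    "District B": (2.0, 3.0, 30.0, 31.0),
}


def gps_matches_district(record, bounds=DISTRICT_BOUNDS):
    """Return True if the GPS point lies inside the box of the stated district."""
    box = bounds.get(record["district"])
    if box is None:
        return False  # Unknown district name: flag for review.
    min_lat, max_lat, min_lon, max_lon = box
    lat_ok = min_lat <= record["lat"] <= max_lat
    lon_ok = min_lon <= record["lon"] <= max_lon
    return lat_ok and lon_ok


def flag_mismatches(records):
    """List the IDs of submissions whose stated district and GPS point disagree."""
    return [r["id"] for r in records if not gps_matches_district(r)]


# Hypothetical water point submissions.
submissions = [
    {"id": "wp-001", "district": "District A", "lat": 1.5, "lon": 30.5},
    {"id": "wp-002", "district": "District A", "lat": 2.5, "lon": 30.5},  # point lies in District B
    {"id": "wp-003", "district": "District C", "lat": 1.5, "lon": 30.5},  # district not in the list
]

print(flag_mismatches(submissions))  # ['wp-002', 'wp-003']
```

A flagged submission is not necessarily wrong. It is a prompt for the supervisor to check with the data collector while the team is still in the area.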

Step five: Finalise data collection

In order to finalise the fieldwork, you should perform some final checks on the activities and create reports to document field operations.

  • Check that all data collectors have handed in or uploaded their survey forms and that quotas have been met.
  • Check the sampling and numbers of data points collected per locality. Has anything been missed and will there need to be further data collection?
  • For paper-based data collection, archive all survey forms consistently and digitise them. For digital data collection, download raw data reports and store them in your system.
  • Create reports on difficulties encountered during data collection, initial results of the data collection, considerations regarding data quality, and the actual sampling in the field against the original sampling method.
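The check of data points per locality against the sampling plan can be sketched in Python as below. The locality names, quotas and submissions are hypothetical, and the sketch assumes each submission records the locality in which it was collected.

```python
from collections import Counter


def quota_shortfalls(submissions, quotas):
    """Return {locality: missing_count} for localities below their quota."""
    collected = Counter(s["locality"] for s in submissions)
    return {
        locality: target - collected[locality]
        for locality, target in quotas.items()
        if collected[locality] < target
    }


# Hypothetical quotas from the sampling plan and submissions received so far.
quotas = {"Village 1": 3, "Village 2": 2}
submissions = [
    {"locality": "Village 1"},
    {"locality": "Village 1"},
    {"locality": "Village 2"},
    {"locality": "Village 2"},
]

print(quota_shortfalls(submissions, quotas))  # {'Village 1': 1}
```

An empty result means every locality has met its quota. Any remaining entries show where further data collection may be needed.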


The better you have prepared your survey, planned the data collection process, trained the data collectors, and resolved errors in the data as you go, the easier data cleaning will be. The analysis and visualisations will also be more valuable. Without the appropriate data, you will not be able to calculate the indicators you need, which endangers the whole purpose of the data collection project. Failing to check data quality as you go will delay results and can lead to a long, cumbersome process of data cleaning.

The next phase of the Handbook will focus on how to clean, analyse, and create meaningful visuals from your data.

Suggested reading

A useful guide which is easily accessible is the UNDP Pulse Lab (Enumerator Guide).


Authors: Nikki Sloan (Akvo.org), Camille Clerx (Akvo.org), Stefan Kraus (Akvo.org)
Contributors: Annabelle Poelert (Akvo.org), Karolina Sarna (Akvo.org), Rajashi Mukherjee (Akvo.org), Uta Wehn (IHE Delft Institute for Water Education)


The Africa-EU Innovation Alliance for Water and Climate (AfriAlliance) is a 5-year project funded by the European Union’s H2020 Research and Innovation Programme. It aims to improve African preparedness for climate change challenges by stimulating knowledge sharing and collaboration between African and European stakeholders. Rather than creating new networks, the 16 EU and African partners in this project will consolidate existing ones, consisting of scientists, decision makers, practitioners, citizens and other key stakeholders, into an effective, problem-focused knowledge sharing mechanism.
AfriAlliance is led by the IHE Delft Institute for Water Education (Project Director: Dr. Uta Wehn) and runs from 2016 to 2021. The project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 689162.