Technical efficiency analysis of coffee production
Data files
Apr 15, 2026 version files 63.44 KB
Abstract
Coffee is one of the most important cash crops in the international economy. However, socioeconomic and technical constraints keep its productivity low, and adopting modern technology more efficiently is increasingly vital to addressing this problem. This study was therefore conducted to measure the level of technical inefficiency in coffee production and to identify its determinants. Technical efficiency in the study area was measured using the output obtained per household head as the dependent variable, together with two sets of independent variables:
1. A set of variables used to measure the elasticity of coffee production in the study area.
2. A set of variables identifying factors that may contribute to the technical inefficiency of coffee producers in the study area.
Cross-sectional data were collected from 285 households selected by simple random sampling. A Cobb-Douglas stochastic frontier production function was used to estimate the coefficients.
The Cobb-Douglas functional form is economical in degrees of freedom and lends itself to interpreting production elasticities, so it was preferred in this investigation. The sum of the estimated output elasticities of the input variables was greater than 1. This indicates increasing returns to scale: doubling all inputs more than doubles output.
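The returns-to-scale reasoning can be illustrated with a minimal Python sketch. In a Cobb-Douglas function, scaling every input by a factor t scales output by t raised to the sum of the output elasticities. The elasticity values and the `herbicides` key below are hypothetical placeholders chosen for illustration; they are not the study's estimates.

```python
def returns_to_scale(elasticities):
    """Sum the Cobb-Douglas output elasticities (the input coefficients)."""
    return sum(elasticities.values())


def output_multiplier(elasticities, factor):
    """Factor by which output changes when every input is scaled by `factor`."""
    return factor ** returns_to_scale(elasticities)


# Hypothetical elasticities, for illustration only (NOT the study's estimates).
example_elasticities = {
    "landcoff": 0.40,
    "laborcoff": 0.30,
    "Coffplant": 0.25,
    "orgfert": 0.10,
    "herbicides": 0.05,  # assumed column name
}

rts = returns_to_scale(example_elasticities)
multiplier = output_multiplier(example_elasticities, 2.0)
print(f"returns to scale = {rts:.2f}")
print(f"doubling all inputs multiplies output by {multiplier:.3f}")
```

Whenever the sum exceeds 1, the multiplier for a doubling of inputs exceeds 2, which is what increasing returns to scale means.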
Dataset DOI: 10.5061/dryad.5mkkwh7fp
Description of the data and file structure
Principal Investigator Contact Information
Name: Alemu Olika
Institution: Wollega University and Development Bank of Ethiopia
email: alemuolika2015@yahoo.com / alemuo@dbe.com.et
Alternate Contact Information
Name: Gemechu Mulatu
Institution: Wollega University
email: gemechumu@wollegauniversity.edu.et
Dataset Overview
These data were collected from coffee-producing farmers in the study area, in a small number of kebeles selected for the research. Before data collection, the following steps were taken:
• The purpose and importance of the study were explained to the participants. Respondents were then informed orally that they had the right to decide whether or not to complete the questionnaire. Participation in the study consisted solely of completing the questionnaire.
• The sampled respondents were told orally that the data-gathering procedures should not cause confusion or harm to participants. The questionnaire was prepared to be clear and impartial.
This study involved no experiments on human or animal subjects. It was solely a technical efficiency study of farmers' coffee-production practices.
The enumerators were trained in the data collection procedures. The study used cross-sectional household data from the 2021 main harvest cropping season. Data were collected on inputs (land, human labor, fertilizer, coffee plants, and herbicides) and on coffee output for the specified period. Input and output data were collected in local units and converted into standard units. Primary data were also collected, by interviewing the selected coffee-producing farmers, on variables that cause variation in production efficiency, such as age, education, household size, extension contact, and gender. Socioeconomic variables such as demographic data, credit access, livestock holdings, wealth indicators, and institutional data were collected as well. In addition, data on coffee production trends, input supply, and extension services were gathered to clarify and support the analysis and interpretation of the primary data.
The questionnaire was printed after it was approved by the Research and Technology Transfer Associate Dean of the College of Business and Economics, Wollega University. The researcher personally visited the selected smallholder farmers at coffee harvesting and collection time and encouraged them to complete the questionnaire objectively and without bias.
Sources of Data
These data were prepared to study the technical efficiency and inefficiency of coffee production. They comprise both quantitative and qualitative data from primary and secondary sources. The primary data were collected from the selected farmers, all of whom currently participate in coffee production, using a structured questionnaire. The questionnaire was developed and pre-tested, then refined and modified based on the pre-test feedback. Primary data collection was carried out by the enumerators, the district's development agents, and the researcher. Secondary data were gathered from governmental and non-governmental institutions, published and unpublished documents, websites, and other relevant sources for analysis and descriptive purposes.
Dates of Data Collection
- Primary data collection - 2021
- Secondary data collection - 2021
Approximately 31,610 farmers from 30 kebeles made up the district's entire coffee-producing population. In the second stage, four major coffee-producing kebeles were purposively chosen from these 30 because of their sizable coffee fields and the need to identify the district's most and least productive coffee-producing areas. There were 1108 people living in these four kebeles. In the third stage, 285 sample households were randomly selected, with the sample size determined using the Kothari formula.
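The sample size step can be sketched in Python using the finite-population formula commonly attributed to Kothari, n = z²pqN / (e²(N − 1) + z²pq). The README does not state the parameter values, so the 95% confidence level (z = 1.96), the proportion p = 0.5, and the 5% margin of error below are assumptions. Under those assumptions, a population of 1108 gives a sample of about 285.

```python
def kothari_sample_size(population, z=1.96, p=0.5, e=0.05):
    """Finite-population sample size: z^2*p*q*N / (e^2*(N-1) + z^2*p*q).

    z, p and e are assumed defaults; the README does not report them.
    """
    q = 1.0 - p
    numerator = z**2 * p * q * population
    denominator = e**2 * (population - 1) + z**2 * p * q
    return numerator / denominator


# N = 1108 is the population of the four selected kebeles stated above.
n = kothari_sample_size(1108)
print(f"unrounded sample size: {n:.2f}")
print(f"rounded sample size:   {round(n)}")
```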
Declaration of Funding
The authors received no funding or support from any organization for conducting this study or for preparing the manuscript.
Description of the data and file structure
The primary raw data from the study area were entered into an Excel spreadsheet called "Raw_Data_s_of_Technical_Efficiency_Factors_(Re_Collected_Data).xlsx" and shared in this data repository. The accompanying Description_and_Meanings_of_Variables file describes one dependent variable, total output, and two sets of explanatory variables. The first set consists of five input variables (landcoff = land covered by mature coffee, laborcoff = labor force for coffee production, Coffplant = mature coffee plants that have started to bear beans, orgfert = organic fertilizer, and herbicides). With output as the dependent variable, these are used to measure the elasticity of coffee production in the study area.
The second set consists of 12 explanatory variables (ageofhhh = age of household head, sexofhhh = sex of household head, educofhhh = educational level of household head, hhsize = size of household, tlu = tropical livestock unit, offincome = off/non-farm income, totcultland = total cultivable land, totlandfrg = total land fragmentation, avrgplotdist = average plot distance, extcontact = extension contact service, train = training to farmer, credit = credit service for farmer) that may contribute to the technical inefficiency of coffee producers in the study area. The Description_and_Meanings_of_Variables spreadsheet gives a thorough description of each variable and its unit of measurement.
Word document uploaded as: Questionnairs_for_Technical_Efficiency_Analysis_of_Coffee_Production
This Word document is the questionnaire used to collect the respondents' raw data. It comprises six parts: Part I: General information about sample farmers; Part II: Economic information; Part III: General information about coffee farming; Part IV: Fertilizer & Chemicals (Herbicides); Part V: Extension service and training; and Part VI: Credit service.
Word document uploaded as: Sources_of_data_Sampling_Technique__and_Sample_Size
This document describes the data sources, the sampling methodology, and the sample size calculation: where the data were collected, how the sample household heads were selected, and how the sample size was calculated.
Figures:
Figure 1: Graph of input-oriented measures of technical, allocative, and economic efficiency
Figure 2: Graph of technical, allocative, and economic efficiency through output-oriented measurement
Figure 3: Sketch of Conceptual Framework of the Study
Figure 4: Map of the Study Area; location of the kebeles where the data was collected
Figure 5: Skewness of Farmers' Technical Efficiency
Files and variables
Questionnairs_for_Technical_Efficiency_Analysis_of_Coffee_Production.docx
Description: The structured questionnaire designed to collect the primary data from respondents.
Raw_Data_s_of_Technical_efficiency_analysis_of_Coffee_production_dryad_Final_edited.xlsx
Description: Contains the collected data in a form suitable for Stata. It includes one dependent variable and two sets of independent variables.
Variables
- totoutput = total output, the dependent variable in the study: the coffee yield obtained in one season of the production year.
- First set: five independent variables used to measure the elasticity of coffee production in the study area (landcoff = land covered by mature coffee, laborcoff = labor force for coffee production, Coffplant = mature coffee plants that have started to bear beans, orgfert = organic fertilizer, and herbicides).
- Second set: 12 independent variables that may contribute to the technical inefficiency of coffee producers in the study area (ageofhhh = age of household head, sexofhhh = sex of household head, educofhhh = educational level of household head, hhsize = size of household, tlu = tropical livestock unit, offincome = off/non-farm income, totcultland = total cultivable land, totlandfrg = total land fragmentation, avrgplotdist = average plot distance, extcontact = extension contact service, train = training to farmer, credit = credit service for farmer).
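The variable groups listed above can be written down as a small lookup, which is handy for checking a spreadsheet header before analysis. This is a minimal sketch: the column name `herbicides` is an assumption (the README names the variable but not its column label), and `missing_columns` is a hypothetical helper, not part of the dataset.

```python
DEPENDENT = "totoutput"

# First set: production inputs. "herbicides" is an assumed column name.
PRODUCTION_INPUTS = ["landcoff", "laborcoff", "Coffplant", "orgfert", "herbicides"]

# Second set: factors that may contribute to technical inefficiency.
INEFFICIENCY_FACTORS = [
    "ageofhhh", "sexofhhh", "educofhhh", "hhsize", "tlu", "offincome",
    "totcultland", "totlandfrg", "avrgplotdist", "extcontact", "train", "credit",
]


def missing_columns(header):
    """Return the expected variable names that are absent from `header`.

    Matching ignores case and surrounding whitespace.
    """
    expected = [DEPENDENT] + PRODUCTION_INPUTS + INEFFICIENCY_FACTORS
    present = {name.strip().lower() for name in header}
    return [name for name in expected if name.lower() not in present]


# Example: a header that lacks the last inefficiency factor.
example_header = [DEPENDENT] + PRODUCTION_INPUTS + INEFFICIENCY_FACTORS[:-1]
print(missing_columns(example_header))
```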
Figure_1_Input_Oriented_measures_for_Technical_Allocative_and_Economic_Efficiency111.tiff
Description: Figure 1 depicts input-oriented measures of efficiency for a farmer who uses two inputs (Z1 and Z2) to produce a single output under the assumption of constant returns to scale. The two inputs, Z1 and Z2, are shown on the vertical and horizontal axes, respectively. KK' is the isoquant of a fully efficient firm, and every point on it represents technically efficient production. Suppose a firm operating at point X in Figure 1 produces the same level of output as the fully efficient farmers.
Figure_2_Technical_Allocative_and_Economic_efficiency_through_output_oriented_measurement
Description: Output-oriented measures of efficiency focus on the change in output that a firm can achieve with the same amount of inputs. Figure 2 illustrates these measures for a firm that uses one input to produce two outputs, Y1 and Y2, which are shown on the horizontal and vertical axes, respectively. The production possibility curve, SS*, shows the combinations of the two outputs that can be produced from a given level of the input. Firms producing on this curve are technically efficient. A firm producing at point C is technically inefficient because it lies below the production possibility curve, which represents the maximum attainable output.
Figure_5_Skewedness_of_Farmers_Technical_Efficiency
Description: Stata output graph.
Figure_3_Conceptual_Framework_of_the_Study
Description: A conceptual framework is a web of connected ideas that, taken as a whole, offers a thorough understanding of a situation. In other words, it is a written or visual product that explains, in narrative or graphic form, the primary subjects of the study. The conceptual framework of this study is based on the institutional assessment and growth technique of new institutional economics.
Figure_4_Map_of_the_Study_Area
Description: The map shows the kebeles that make up the study area. A few kebeles were chosen purposively because of their large coffee farms and the need to determine which areas of the district were most and least productive for coffee.
Description_and_Meanings_of_Variables.xlsx
Description: Explains the meaning of each variable in the data. It covers the dependent variable (output) and each of the two sets of independent variables.
Variables
- totoutput = total output, dependent variable.
- Set 1: independent variables that measure the elasticity of coffee production in the study area
- Set 2: independent variables that may cause technical inefficiency of coffee producers in the study area
Units of measurement
- 1 quintal of coffee yield = 100 kg
- 1 hectare of land = 10,000 m^2
- 1 working day = 8 hours
- Credit = the loan obtained from a local credit service, in Ethiopian birr
- The data are cross-sectional
- The data were collected for the 2021 production year
- TLU = tropical livestock unit: an estimate of the livestock owned by farmers, obtained by tropical livestock unit conversion. It is an estimate because cattle can die, be sold, and so on.
- Sex = sex of the household head, a dummy variable coded 1 or 0
- HH size = household size; it does not count every family member
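The conversion factors listed above can be applied as in the following Python sketch. Only the three factors stated in the list are used; the function names are hypothetical and the local units themselves are not covered, since the README does not define them.

```python
KG_PER_QUINTAL = 100        # 1 quintal of coffee yield = 100 kg
SQ_M_PER_HECTARE = 10_000   # 1 hectare of land = 10,000 m^2
HOURS_PER_WORKDAY = 8       # 1 working day = 8 hours


def quintals_to_kg(quintals):
    """Convert coffee yield from quintals to kilograms."""
    return quintals * KG_PER_QUINTAL


def sq_m_to_hectares(square_metres):
    """Convert a land area from square metres to hectares."""
    return square_metres / SQ_M_PER_HECTARE


def hours_to_workdays(hours):
    """Convert labor hours to 8-hour working days."""
    return hours / HOURS_PER_WORKDAY


print(quintals_to_kg(2.5))
print(sq_m_to_hectares(5000))
print(hours_to_workdays(20))
```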
Sources_of_data_Sampling_Technique__and_Sample_Size.docx
Description: This document describes the data sources, the sampling methodology, and the sample size calculation: where the data were collected, how the sample household heads were selected, and how the sample size was calculated.
Code/software
In this work, a cross-sectional dataset of 285 respondents was used for the econometric analysis to estimate the combined frontier and inefficiency model. The factors influencing the production efficiency of coffee growers were estimated using Stata version 15.0 (StataCorp LLC, 2017).
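For orientation, a common one-step specification of a Cobb-Douglas stochastic frontier with an inefficiency-effects model is sketched below. The README does not give the exact equations or distributional assumptions, so the notation here is an assumption: Y_i is total output of household i, X_ij are the five inputs, Z_ik are the 12 inefficiency factors, v_i is symmetric random noise, and u_i ≥ 0 is technical inefficiency.

```latex
\begin{align}
\ln Y_i &= \beta_0 + \sum_{j=1}^{5} \beta_j \ln X_{ij} + v_i - u_i \\
u_i &= \delta_0 + \sum_{k=1}^{12} \delta_k Z_{ik} + w_i \\
TE_i &= \exp(-u_i)
\end{align}
```

Under this form, each β_j is the output elasticity of input j, their sum gives the returns to scale, and a positive δ_k means that factor k is associated with higher technical inefficiency.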
Access information
Other publicly accessible locations of the data:
- None
Data was derived from the following sources:
- Agricultural and Rural Development Office
- Coffee, Tea and spices Development Office
- Primary data were collected with a structured questionnaire from the selected farmers of the study area by trained enumerators, the district's development agents, and the researcher.
Human subjects data
To the greatest extent feasible, this data has undergone thorough de-identification. Names, addresses, email addresses, and other direct identifiers have all been permanently deleted, along with all other personally identifiable information (PII). Furthermore, even with the use of easily accessible information, the data has been processed to remove any chance of reasonably determining an individual's identity.
The study used both primary and secondary data, as well as quantitative and qualitative data. The primary data were collected using a structured questionnaire, which was designed and pre-tested for this study and then refined and modified using the pre-test feedback. Primary data collection was carried out by the enumerators, the district's development agents, and the researcher. The enumerators were trained in the data collection procedures. The study used cross-sectional household data from the 2021 main harvest cropping season. Data were collected on inputs (land, human labor, fertilizer, coffee plants, and herbicides) and on coffee output for the specified period. Input and output data were collected in local units and converted into standard units. Primary data were also collected, by interviewing the selected coffee-producing farmers, on variables that cause variation in production efficiency, such as age, education, household size, extension contact, and gender. Socioeconomic variables such as demographic data, credit access, livestock holdings, wealth indicators, and institutional data were collected as well. In addition, data on coffee production trends, input supply, and extension services were gathered to clarify and support the analysis and interpretation of the primary data. The researcher closely supervised data collection so that any errors could be corrected as early as possible. Besides primary data, the study used secondary data from governmental and non-governmental institutions, published and unpublished documents, websites, and other relevant sources for analysis and descriptive purposes.
