Workshop on Validation of a Measure of Household Hunger for Cross-Cultural Use

U.S. Department of State Third Annual Conference on Program Evaluation - Health Track
Washington, DC
June 9, 2010


MS. DEITCHLER: Well, thank you very much for that introduction. Maybe it's useful to begin by telling you a little bit about where I come from, FANTA-2. It's a USAID cooperative agreement. And we work to improve nutrition and food security programming, strategies, and policies through technical assistance to USAID and its partners, including implementing agencies, non-governmental organizations, international organizations and host country governments.

And what I do there is I focus on research and measurement issues. So it's probably no surprise that I'm presenting on measurement of food security, and in particular the results of a recent study that we just published: a validation study for a household measure of hunger that's appropriate for cross-cultural use.

So to begin, what I would like to do is kind of have a common understanding of food security. And I think we all have a general notion of what food security is, but I think it's good to have a common definition. And what I have here is the USAID definition. And what I would like is for us to look at it from a measurement perspective because sometimes, you know, we have these complex outcomes that we're interested in measuring, but it's really useful to look at a common or established definition and think about it from a measurement perspective.

So USAID defines food security as when all people at all times have physical and economic access to sufficient food to meet their dietary needs for a productive and healthy life. So to measure that, it can be quite complex, and what usually happens is people divide it into three different elements: food availability, food access, and food utilization and consumption.

And so here in this figure, I've just kind of broken it out to show at what level each of those elements of food security are generally measured and what the standard method is for obtaining that measurement. So at the national level, we are usually looking at food availability, and oftentimes an FAO method, which uses food balance sheets, is used.

At the individual level, we have anthropometric indicators -- to look at food consumption and utilization issues. And these are both well-established methods, and they allow us to have a comparable measure, one we can compare across individuals or nations or regions, in the case of anthropometric indicators.

In terms of household food access, we have less standard measures. We -- there are a bunch of different methods that are available, but there really isn't one standard that has been adopted. And the methods that are available don't meet this criterion of comparability, of being comparable from region to region or nation to nation, for example.

So this is -- this figure shows the FAO measure of food availability mapped to different nations. And the reason I have this is I just want to point out that food availability only gets us so far. We can look at this and this provides useful information, but we can't get deeper, right? We can't get deeper into what's happening within these countries.

And conversely, with an individual measure like anthropometry, for example, well, we saw that household food access is actually a prerequisite, right, to having good consumption or utilization. So we're really missing a critical piece if we can't get in there and understand issues related to household food access.

Without the information, we're unable to understand who is affected by food insecurity, to what extent households are affected by food insecurity, and how households are affected by food insecurity. And in the current environment, I think there is recognition that it is becoming more and more important to have this understanding, in light of things happening with climate change, in light of the 2008 global economic crisis, and also with the Feed the Future initiative that is now happening.

So this is where we have focused a lot of effort. Beginning way back in 2000, we started some work to try to identify a measure of household food access that would allow for comparable measurement. And these are some of the criteria that we set out for some method or measure or tool that we would develop.

First, of course, we wanted it to have good psychometric properties. With any type of measure or method, you want it to have valid psychometric properties. We wanted it to be meaningful, right, and to reflect what it is that we're actually trying to measure; to be reliable and accurate; and at the same time to be simple. We don't want something to be too burdensome.

We wanted it to reflect the current situation so that you could make timely decisions around it and it would be relevant for programming. And also, of course, you know, back to that FAO map, we wanted to be able to look within countries, to be able to obtain data about communities or be able to look at rural areas and urban areas separately.

At the same time, we also wanted to be able to aggregate data up. Of course, from a donor perspective -- USAID missions, USAID, State -- there are also accountability issues for those organizations reporting to Congress. We need to be able to roll information up and report on success. And to be able to do so, you have to have a comparable measure. You can't compare -- you can't aggregate data up if it's not meaningful in the same way across different regions and settings.

So that's the last criterion here, a comparable cross-culturally valid measure. And I just want to -- I want to spend a little bit of time to focus on what this actually means, because this is oftentimes overlooked in tool development or method development. Well, this is partly the reason why it is overlooked: it is such a high-level requirement.

To have a comparable cross-culturally valid measure means -- this generally means that the same method or tool is going to have to be applied in all areas and that it is meaningful in all areas. And then in addition to that, and this is the hard part, that whatever measure you obtain, it means the same thing across areas.

So that's a really hard requirement to meet, and it's oftentimes not even tested; it's just assumed that data are meaningful in the same way when you use the same method or tool. And particularly with a complex phenomenon that we're trying to measure, like food security, it can't really be assumed that whatever tool you develop is meaningful in the same way in these different cultures and contexts.

So this was really a challenge that we were setting out to try to help address. But if we could address it, the payoffs are really big to have a cross-cultural measurement tool. It allows for decision-making, to prioritize areas for intervention. It enables multi-regional, multi-country, cross-country evaluation of policies and programs, and it's really useful for policy advocacy. You can track things globally and see how things are moving. You can look at a global situation and, you know, determine if there needs to be more momentum or more action, more resources devoted to addressing the issue.

So beginning in 2000, we worked to develop the scale. And I'm not going to go into too much detail about all of the background. What I really want to focus on is what is labeled here as Phase II. This is the phase that we just recently completed, which included the validation of the scale that we developed.

But just briefly about Phase I. This phase began in 2000. And if you're interested in details about this phase, this is written up in a number of publications. Paula is in the room. Paula is actually an author on a couple of those, I think, and oversaw the technical work for those.

But when we began in 2000, we looked to -- we were inspired, I guess, by the U.S. method of collecting data on food security. And I don't know how many of you are aware of that, but USDA and other U.S. organizations actually collect data on food security in the U.S. And it's an approach called the Household Food Security Survey Module.

And this method or approach is based on the idea that you ask the respondent in the household directly about experiences that are known to be related to food insecurity. The answers to the questions are summed -- it's like a scale, right -- and then this score reflects that household's food security measure. And then there are cutoff points across this scale to determine if the household falls in a very food insecure category or food insecure, et cetera.
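The summing-and-cutoffs idea can be sketched in a few lines of Python. This is only an illustration: the item count, coding, and cutoff values below are hypothetical, not the official HFSSM thresholds.

```python
# Illustrative experience-based scale scoring: sum "yes" responses,
# then map the raw score to a category with cutoff points.
# The cutoffs below are made up for illustration, not official values.

def scale_score(responses):
    """Sum affirmative responses (1 = yes, 0 = no) into a raw score."""
    return sum(responses)

def classify(score, cutoffs=(2, 5)):
    """Map a raw score to a category using illustrative cutoffs."""
    if score < cutoffs[0]:
        return "food secure"
    if score < cutoffs[1]:
        return "food insecure"
    return "very food insecure"

# A household answering yes to two of six illustrative items:
print(classify(scale_score([1, 1, 0, 0, 0, 0])))  # food insecure
```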

So we were inspired by this approach, and we wanted to know: will this approach work in a developing country context, asking respondents directly about food insecurity experiences and using a scale to do this? So we undertook some in-depth qualitative work. This was in collaboration with Cornell and Tufts University, Africare, and World Vision in Burkina Faso and Bangladesh, and from this qualitative work we identified questions or experiences that seemed to be related to the experience of food insecurity in those cultures.

And then through a process of consensus building with academics, non-governmental organizations, and donors, we worked to identify the questions that could potentially be used in a scale across all regions and settings to measure household food access.

So this is the scale that we came up with. It's nine items and it's based on a four-week recall. So the person who is generally responsible for the food of the household is asked those nine questions and asked if the household experienced each of them. And then for those questions where the household replies yes, we experienced that, how often did it happen? Was it rarely, sometimes or often in the last four weeks?

And the theme that I want to point out about these nine questions is that they were intended to reflect three domains that are believed to be universal experiences related to food insecurity. And this includes anxiety or worry about the food supply. So that's item number one. That's reflected there. And then a reduction in the quality of the food. And that's items two through four. And then a reduction in the quantity. And in this scale, it's items five through nine. Okay. So that's what I want to point out there.

So now I'm going to get to the validation part, which is really what I want to focus on. After we released this scale -- that was in 2006 -- several organizations adopted the scale and used it in their programs. And then they were kind enough to share with us the data that they collected so that we could use that for our validation.

And then these are the -- so we had seven datasets that we were able to use. And I have there the year the data were collected and the sample size. And we used a method for our validation called Rasch measurement models. And I don't know how many of you are familiar with that. I'm not going to go into details about them. I'm just going to tell you what you need to know to understand the rest of my slides.

But this is basically -- it's an item response theory method where, if you have a scale, you're seeing how the items work together and you're ensuring that they have valid psychometric properties. And so it's what the USDA has used to validate their measure. And so we also used it to validate our measure.

And this figure -- it's probably good if I use the pointer for this. This figure is a good illustration of how Rasch analysis works. Rasch measurement models assume that what it is that you're trying to measure exists in more severe forms and less severe forms. So this makes sense, right? We're trying to measure food insecurity, household food access. We're assuming it exists in more severe forms and less severe forms.

So then what it does is it takes the responses from households to the various items that comprise the scale and it attributes -- based on the item responses, how households respond, it attributes what I'm calling severity parameters, okay, to each item. And so that makes sense. We're assuming also that the items that make up the scale are more and less severe experiences related to household food access.

So on the right side of this figure, this is the more severe and this is the less severe. So less severe generally have the negative calibrations and then as you get more severe, they're going to have higher and positive calibrations.

These are arbitrary numbers, by the way, just for illustration, but really when the scale was created, it was believed that item one was probably the least severe, that was the worry, remember, and item nine was probably the most severe, that was going a whole day and night without food. So this could, theoretically, reflect kind of the calibrations that could come out from the Rasch measurement model.

So the other thing that happens with Rasch is that the individual households also are calibrated to have a severity parameter based on how they responded. Well, that makes sense. And they can be placed on the same scale with the items. So this household, for example -- they responded to all nine items, right, but this household has a severity parameter of negative 2.9. And based on where it's placed and the severity parameters of the items, we would expect that this household had said, yes, I experienced items one, two and three, but no, I didn't experience these items. Okay. Does that make sense?
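For readers who want the mechanics, the dichotomous Rasch model puts households and items on one common scale: the probability that a household with severity theta affirms an item with severity calibration b depends only on the difference theta minus b. A minimal sketch, with invented item calibrations echoing the negative 2.9 household example above:

```python
import math

# Dichotomous Rasch model: probability that a household with severity
# theta affirms an item with severity calibration b.
def p_affirm(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

household = -2.9                                    # household severity
item_calibrations = [-4.0, -3.5, -3.2, -1.0, 0.5]   # invented values

for b in item_calibrations:
    likely = "yes" if p_affirm(household, b) > 0.5 else "no"
    print(f"item at {b:+.1f}: expected answer {likely}")
# Items calibrated below -2.9 are likely affirmed; items above it are not.
```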

Okay. So these were the things that we looked at in our validation work. I'm not going to focus on internal validity. That's just basically that it has good psychometric properties. We will get a little bit into external validity -- that it's measuring what we think it's measuring, that it's reliable, accurate -- but again, I'm not going to focus too much on that because these are just basic criteria for any measurement indicator.

What we're really interested in is the cross-cultural validity aspect, because that is what I think is novel about this study and what is probably the most important takeaway. And we wouldn't have advanced to cross-cultural validity if we didn't have the internal validity. It just doesn't make sense, right? Those are prerequisites.

Okay. So here is just another theoretical graph. And here what I've done -- this is an example of what perfect cross-cultural equivalence would look like. Okay. Remember we had seven datasets in our validation study. So one dataset is plotted here and the other dataset is plotted here. And then remember those severity parameters for the items, the nine items -- they are plotted, one dataset against another.

So if they have the same severity parameter, they are equivalent, okay, and they would be along the diagonal, because that means that, you know, if item two is a negative five in this population and a negative five in this population, they're essentially equivalent.

There are more than nine dots here because remember we had frequencies. We had the never, rarely, sometimes, often frequencies. So those also get severity parameters. So that's why there are more than the nine dots.

Okay. So I'm going to take a shortcut here. The HFIAS that we developed was nine items, remember, and four frequencies: never, rarely, sometimes, often. It was not cross-culturally valid. It didn't work. Okay. So what I'm going to show you instead, what I'm going to focus on, is what we did: we used the results from the analysis of the HFIAS to see, well, is there some configuration of items and frequencies that could work, starting with this nine item, four frequency scale? We looked at different subsets of items and frequencies using the results.

And these are the two scales that came out best for cross-cultural validity. One was a five item, three frequency scale and one was a three item, three frequency scale. We had a lot of problems with the “rarely,” which, you know, in retrospect, everything seems kind of obvious. But “never,” “rarely,” “sometimes,” “often” -- it's a little bit too finely discriminated for people, I think, to respond the same way in different cultures. “Rarely” in some languages is kind of hard to define. It's not so surprising really.

So what we found was if you combine the “rarely” and the “sometimes” together, if you aggregate that into one frequency, it works much better in terms of cross-cultural comparability.
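Concretely, that aggregation is just a recode that merges the two middle frequency categories. A sketch (the numeric codes are illustrative):

```python
# Collapsing "rarely" and "sometimes" into one middle frequency
# category, as described above. The numeric codes are illustrative.
RECODE = {"never": 0, "rarely": 1, "sometimes": 1, "often": 2}

raw = ["never", "rarely", "sometimes", "often"]
print([RECODE[r] for r in raw])  # [0, 1, 1, 2]
```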

QUESTION: Was it just the cultural issue or was it also socioeconomic?

MS. DEITCHLER: It could be both. We can't really distinguish that very much from the work we did. I mean, that's an interesting question; it could be both. And I guess I'm using cultural to reflect any difference in setting. So it would encompass that, yeah.

So these are the two scales that we focused on more in depth. And what I want to point out here is: remember the nine item scale we started out with, where we tried to have three domains reflected: the anxiety, having a reduction in the quality of the food, and having a reduction in the quantity of the food.

Well, if you look at these two different scales, both of these only focus on the quantity aspect. So we're really looking at food deprivation here. Interesting. We also collected qualitative data from our collaborators who contributed the data to the study. And the qualitative data we obtained from them concerned any problems in administering the scale or translating the items into the local languages. And issues related to quality were very hard to translate. And maybe not surprisingly, you know, it was very hard to come up with the wording in different contexts.

So I will focus on these two scales. I'm not going to focus on this slide because it's not related to the cross-cultural validity, but I'll focus on this one. Here you'll see I have -- this is like we looked at earlier, right, the theoretical plot that I showed for the cross-cultural validity of the nine items. Here we have the five items and the frequencies plotted.

And I just picked some examples. I'm not going to show you all seven datasets, but this is Mozambique in 2006, and then we also had a Mozambique survey conducted in 2007. And I have this red line plotted so you know what perfect cross-cultural equivalence would look like. This, to me, looks pretty good. We were pretty happy when we saw this.

But then I'm also going to show you the worst performer, and that was Zimbabwe. So, you know, for the other datasets that we had, there was pretty good cross-cultural equivalence, but then there were a couple of outliers, Zimbabwe and Kenya. They just had this all-over type of pattern. So cross-cultural validity didn't hold for all datasets for the five item, three frequency scale either.

But now I'm going to move to the three item, three frequency scale. And here is Mozambique again. It looks pretty good. And then there is Zimbabwe. It cleaned up pretty well. There is still, you know, a little blip for question seven, step two, but it's pretty good. And actually what's more important are these graphs. Here again, I have Mozambique against Mozambique round two. It's the three item, three frequency scale. And what I've done here is, rather than plotting the items, this is the household measure.

Remember we have a measure we get. Remember that figure when I was explaining the Rasch analysis: the items get severity calibrations and the households do too. So this is actually plotting the household severity measure, and it's plotting it for each scale score. So we have a three item, three frequency score. It can range from a scale score of zero up to six. And zero and six are hard to calibrate.

So I have one, two, three, four and five plotted. And perfect cross-cultural equivalence, again, would be right along this line. And this looks pretty good. We were pretty happy with that. And then you can see Zimbabwe against Mozambique. And again, it's really good. We were really happy with this.

So in this figure, I've also added -- you can see two horizontal lines and two vertical lines. And these are at the same placement, right? I have it at negative three here, negative three here -- the severity parameter. And also this is .75 and .75.

And what we wanted to do was to create a categorical variable. Okay. We didn't just want this scale score ranging from zero to six, because oftentimes for monitoring and evaluation, right, it's really useful to have a categorical variable -- a dichotomous variable or maybe a three category variable. So you can look at those who meet a certain criterion versus those that don't. It's easier sometimes to set targets, et cetera, et cetera.

So these are where we set our thresholds for our categories. So we have a three category indicator. And what we really want to look at here is that the household measures fall within the same boxes, right. They're not only along the cross-cultural equivalence line, but also, and it should follow, that within each category, the same scale scores are reflected. And that's true, and it was true for all datasets. So we felt really good about that.

The next thing I'm going to move into now is just trying to highlight the relevance and utility of this new indicator that we validated. We called it the Household Hunger Scale, because we wanted to change the name from what it was previously; the previous name really highlighted that it was a measure of household food access. And because we're really focusing on food deprivation -- because those were the only items that were found to be cross-culturally equivalent -- we wanted to make sure the name for our scale actually reflected that.

So we changed it to The Household Hunger Scale, and we had the three categories I told you that we developed based on those thresholds, and we named them little to no household hunger, that's a score of zero to one; moderate household hunger, that's a scale score of two to three; and severe household hunger, that's a scale score of four to six.
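Putting the pieces together, the scoring implied by those categories can be sketched as follows: each of the three items contributes 0 (never), 1 (rarely or sometimes, after the recode), or 2 (often); the item scores are summed to a 0-6 scale score; and the thresholds above assign the category. The function name is mine, not from an official codebook.

```python
# Household Hunger Scale categorization, following the thresholds
# given in the talk: 0-1 little/no, 2-3 moderate, 4-6 severe.
def hhs_category(item_scores):
    score = sum(item_scores)  # each item scored 0, 1, or 2; total 0-6
    if score <= 1:
        return "little to no household hunger"
    if score <= 3:
        return "moderate household hunger"
    return "severe household hunger"

print(hhs_category([0, 1, 0]))  # little to no household hunger
print(hhs_category([2, 2, 1]))  # severe household hunger
```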

And then here are -- I have six of the seven datasets here. I excluded Kenya because Kenya was purposive sampling. So it's like if I had presented the results for Kenya, I think it would have been a little bit harder to understand what it meant, because it was an intentional sample. It wasn't representative of anything.

So you can see here that the results are in a policy relevant range, which I think is interesting. A lot of people -- the initial reactions, oftentimes, when we show people what the scale is comprised of, the items, it's, oh, well, those are too severe of experiences. No one is going to say yes to those experiences. Well, actually, across all of these environments, they did say yes. And there is severe household hunger based on our categorization of it, and it is in a range that is relevant for policy and programming decision-making and targeting. And our results were pretty much along the lines of what we expected in terms of comparing between different contexts and countries.

So here I'm just going to show a couple of cross tabulations that we did with other variables that were in datasets. This is Mozambique again. And all I've done here is I've plotted on the Y axis, this is the proportion that are in each household hunger category, and then this is just a variable that the collaborator who collected this data also derived, which was a household wealth score.

So as household wealth score goes up, we would expect the proportion that had severe hunger to decrease, which is what we see, and likewise, as household wealth score goes up, we would expect little to no hunger to increase, which is what we saw. So I mean, there are a couple of blips in there, but you have to remember -- we're looking by household wealth score. So some of these sample sizes are pretty small. That we have this good of a linear trend, I think, is actually -- we were happy with it.

Here is another example. This is West Bank and Gaza Strip and South Africa. Here both of these datasets had a variable to reflect the median monthly household income by consumption unit. So I just plotted what the -- well, sorry. It was just monthly household income by consumption unit. So I plotted the median household income by the different household hunger categories. Again, you see a decrease, as we would expect and want. And it's a pretty significant decrease across the different categories. And I think it's also striking how similar the patterns are. That was just coincidence, lucky.

And then finally -- this is also important, right? When we're monitoring and evaluating, we want our indicators to be sensitive to change. We want to be able to detect change and understand what's happening. And we only had one dataset that allowed for -- or it was two datasets, but we had one instance where we could look at trends across time. And it was a really interesting scenario, because we had data from two provinces in Mozambique, pre and post harvest. And one of the provinces was severely affected by a cyclone and other climatic events and the other was not. So we expected that there would be a difference in the type of change we would see over time.

And you can see in -- okay. When you have pre/post-harvest, obviously we're expecting -- this is moderate to severe hunger -- we're expecting it's going to decrease, but we would probably expect it to decrease more. But this was the province that was severely affected by the cyclone.

So there is a slight decrease, but really, it's probably not even a significant decrease. It's probably -- it's really -- things kind of remained constant over time. Whereas in this province, there was a substantial decrease. And this was the province that was not really affected by the cyclone or other climatic events. So that was, you know, further support that this really could be a useful and relevant indicator.

QUESTION: So what did -- so they didn't get a cyclone, but what did change?

MS. DEITCHLER: It's post-harvest. So it was collected at a different time. Right. So we're looking at food deprivation. So pre-harvest, it was kind of at the end of the food supply, whereas post-harvest, there should have been plentiful food if the harvest was good.

QUESTION: So then a few months later, this would go back up again?

MS. DEITCHLER: Yeah, potentially. That's what one might expect, yeah, unless there was a program intervention happening that was really effective to mitigate that. Right. But there is this seasonal type of pattern that you would expect to see pre and post-harvest in areas that are heavily agricultural.

Okay. We had lots of limitations to this study, like any research, and so I have them listed here. I don't know if I'll go through all of them, but maybe I'll highlight a few. The datasets that we used, as you saw, it was what was made available to us. And this happened to be mostly datasets from Southern and Eastern Africa. We have, you know, also the dataset from West Bank and Gaza Strip. But we didn't have any, for example, from Latin America or other regions.

So, you know, while this scale appears to be cross-culturally valid, at least for the seven settings where we had data, and we're hoping it has potential for broader cross-cultural validity as well, this is something we're going to continue to look at as more data are collected.

There were sample size limitations. There are always some things, but things still looked pretty good based on what we did. There are certain assumptions that the Rasch model makes about your data. We looked at one of these very carefully. Another one of these is very difficult to look at for measures of food security and scales that are comprised of few items. So we really, with the analytic methods that are available right now, we can't look at it. That's something that we'll do in the future if some statistician helps us out and develops the appropriate methods.

Of course, when we were looking at the relevance in comparison to other variables in other datasets, we were limited by whatever the collaborator had collected. And the other important thing, which actually has operational implications, is that, you know, when we were validating that three item, three frequency scale, the household hunger scale, we were validating data that were collected as part of a nine item, four frequency scale. We didn't go out and collect new data. We were still using the same data that were collected using that HFIAS.

So there is a question. You know, if we were to collect data using three questions and three frequencies, would it still be valid? Would we still get the same sorts of results, or are the answers to those questions somehow preconditioned on the earlier questions in the scale? So that's a question we're going to try to look at as well.

So these are our next steps. We are going to publish an operational guide -- instructions for how to use this scale and how to collect data for it. We are going to continue to look at the cross-cultural validity and the external validity. And I want to highlight some -- yes?

QUESTION: When do you expect to publish that?

MS. DEITCHLER: That's a good question. The guide is currently in development. I guess my hope -- it probably wouldn't be published until, I want to say, October is what we're probably aiming for, October 2010. We do, however, have -- the full report of this validation study was released just last week. So you can find that on our website, and it really goes into much more depth than I'm describing here. And we're also, in the very near future, going to have a summary, like a technical brief, that's going to come out to also describe that work. And I would say that would be probably in the next four to six weeks.

And really one thing that I want to point out here is that there are important tradeoffs to make. It's really great, I think, to focus on an indicator that has cross-cultural comparability. We need it for a lot of reasons. But it's important to also understand that it does sometimes come with a bit of a tradeoff. You make compromises in your measure so that you can attain that cross-cultural validity of your tool or your method.

So it might not be the most sensitive measure in every culture or context, and in some contexts, there might be a very good measure of household food access, which is what we were aiming to achieve, available and validated. And so what I want to point out is, I think there is use for both.

You know, when a program is doing a problem assessment or a baseline or a final evaluation, if there is a method available that's been validated for that specific area, it's useful to collect that data. But then also, so that there is some comparable data that can be looked at and compared to other regions and settings -- allowing for this kind of meta-analysis by donors and missions and agencies, and giving some relevance to what it means for that context relative to the broader geographic area -- I think there is use for the three item, three frequency scale, the household hunger scale.

And then lastly, I just want to point out, as I said earlier, that something to keep in mind is that in the absence of cross-cultural validation, one really cannot assume that a method or measure is comparable cross-culturally. And that is just something I think is worth keeping in mind.

And I think I'll also just highlight, as my closing point, that we recently learned that the indicator has been adopted for the Feed the Future initiative in the results framework, for the increased resilience of vulnerable households and communities objective. So it probably -- there will be more data collected on this indicator, and hopefully this will be really useful for monitoring policies and programs addressing both food security and health objectives. So thanks, and I'll welcome questions and comments.

MS. GORDON: Let me just take questions for you now. Thank you. That was an excellent presentation.

We are recording this, so just a reminder: if you have a question and you're sitting at the table, please turn your microphone on. Otherwise, we have Suzanne, who has a portable mike. We have a couple of minutes. Since we started a little bit late, we'll take a few questions and then we'll end.

QUESTION: That was really excellent. And I am very, very interested. We probably are all very interested in food security, but specifically, for the program that I am working with right now, which is the PEPFAR program, we have, across the board, been very intensively looking at issues of food security specifically for those who are HIV positive, as well as orphans and vulnerable children due to HIV/AIDS.

So I guess the question that I have is -- the first one is: have you at all been working with Macro International on the demographic and health surveys in this? I know that the DHS right now does not have a food security indicator, but they do have a food security module. And I guess my second question is: how does this compare to their module, given that I am actually not all that familiar with what is inside of it?

MS. DEITCHLER: Well, I'm embarrassed to say I'm not so familiar with their module. I don't know what is in their module either. I would -- I'm familiar with their SES measures, and I don't know if that's what is incorporated in their food security module.

We haven't been working closely with them on this, and it partly might be related to timing. You know, they -- Macro International -- revise their surveys about once every five years. And they were revising when we were still in the process of just finalizing our results.

So what I understand is, their core questionnaire has already been finalized. There is not really room for additional measures to be incorporated; however, if missions -- since the DHS are USAID funded -- if USAID missions are interested in having this measure collected, you know, there are country-specific elements that can make their way into the DHS upon request.

And I think Nepal, for example -- there's a DHS underway there right now, and I believe that they requested the HHS be included. And there may be one or two other countries as well. So there is potential for it, but it's not necessarily going to be the standard, at least in DHS 6.

QUESTION: That's great. Thank you. We struggled with getting a food security indicator, at least one, even recognizing that one was probably not even enough, into the DHS last year because they revised it as of, like, July of 2009. So, you know, definitely not on the same time line. But we consider it to be so important to actually get this done. So it's great to know that this exists.

MS. DEITCHLER: Yeah. And the other thing related to HIV/AIDS that I would just mention is that two of the datasets that were included in this validation study were actually collected specifically with HIV/AIDS affected populations. So, you know, if we're talking about cross-cultural comparability, this isn't just about distinct population or linguistic groups; it's also about populations that may experience food insecurity differently, and HIV/AIDS affected populations could certainly fit into that category.

And so we have some, you know, empirical evidence that -- at least for the two datasets that we had with HIV/AIDS affected populations -- that has been taken into account in our cross-cultural validation.

QUESTION: Have you taken these conclusions, the three-item scale, back to the group of major stakeholders and sort of revalidated it with them? And was there any pushback on the fact that it was such a narrow scale, because people fought very hard for their items to be included? And then also, was there pushback on the cutoffs?

MS. DEITCHLER: We did not have a formal stakeholder meeting. Paula was involved in kind of the earlier process of the scale development, and that was actually before my time and my involvement in this scale. And I understand that during that process, there were kind of formal stakeholder consultation meetings.

The way in which we had, I guess, stakeholder consultation was more ad hoc. We -- this validation report was written in collaboration with two stakeholders, FAO and Tufts University, and we shared the report for comments with USDA, who is kind of a more peripheral stakeholder, I guess, but certainly, you know, very wise in food security measurement, with all of their experience.

And yeah, but we felt like it would be difficult to have formal stakeholder meetings around the scale because it was so empirically driven: there wasn't much room for negotiation, I guess, right, because our results and what the scale was were really data driven. There was no pushback on the cut points.

There was, however, some, I would say, resistance at the beginning -- I would phrase it maybe more as disappointment -- that we didn't really have a measure of food access, which captures issues related to quality and anxiety. There was some nervousness around this, but if we wanted to obtain cross-cultural validity, we had to narrow it down to food deprivation and those three items. And that was really what we were focused on.

So that's what we came up with. And that's kind of -- it was a bit of a circuitous path, but through these discussions and debates that we were having informally, that's really how we arrived at this conclusion. And I firmly believe it: in those settings, if you do have a culturally specific measure of food security, use that, but use this in addition. Either one doesn't prohibit the use of the other.

And so that's kind of where that recommendation came from -- this idea that, well, it's really too bad that we weren't able to capture something more broad, that we are focusing on food deprivation, or household hunger, as we're calling it. Which is, in itself, important, but maybe just doesn't capture the broader experience.

But something else that I'll just point out that I think is very interesting to think about: if you look at our -- the proportion of households that are moderately and severely hungry based on the seven datasets that we analyzed -- if you were going to incorporate those less severe experiences into your scale, you're going to have such high proportions that it's really hard to make decisions around that, because you're going to be in the 80 and 90 percents. So I just think that's also something interesting to consider. But, yeah, good question.
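To make the scoring discussed here concrete, below is a minimal sketch of how a three-item, three-frequency household hunger scale with moderate/severe categories can be computed. The 0/1/2 point weights and the 0-1, 2-3, 4-6 cut points follow the published HHS indicator guide as commonly cited, but they are not stated in this transcript, so treat the specifics as illustrative; the item wording in the usage example is likewise paraphrased.

```python
def hhs_score(responses):
    """Score a three-item household hunger scale.

    `responses` holds, for each of the three items, how often the household
    experienced it in the recall period: "never", "rarely_sometimes", or
    "often". Each item contributes 0, 1, or 2 points, for a total of 0-6.
    Point weights and cut points are illustrative, as commonly cited for
    the HHS; consult the official indicator guide for the definitive values.
    """
    points = {"never": 0, "rarely_sometimes": 1, "often": 2}
    total = sum(points[r] for r in responses)
    if total <= 1:
        category = "little to no household hunger"
    elif total <= 3:
        category = "moderate household hunger"
    else:
        category = "severe household hunger"
    return total, category

# Hypothetical household: no food of any kind "rarely/sometimes",
# went to sleep hungry "often", whole day without eating "never".
print(hhs_score(["rarely_sometimes", "often", "never"]))
# -> (3, 'moderate household hunger')
```

Because each item has only three frequency responses, the whole scale takes just seven possible values, which is part of what makes the categorical cut points workable for program decisions.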

QUESTION: So a bit of a technical question. When you assigned these 30 values on the Rasch scale, how did you actually get those numbers for the nine questions? Was that assigned prior, or was it based on the data and assigned afterwards? And if so, was there a different Rasch scale for each country, or was it one that you used for the overall study?

MS. DEITCHLER: Okay. Good. I think it will be easiest to go back. Technical help. Thanks.

Okay. This was theoretical. I made up all these numbers. So when the data were analyzed, each dataset was analyzed separately, and the Rasch model assigns severity parameters to each of the items and the households. So those are determined by the Rasch model based upon how the households respond to the items. And they were derived separately by dataset.

And then when I plotted them, I used the data-specific calibrations for each dataset. Okay. So they're entirely independent. The only thing that I did was standardize them by the standard deviation so that they were on the same metric, but the measures were independently derived using the Rasch model on each dataset. Yeah.
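The workflow described here -- a Rasch model assigns a severity parameter to each item within each dataset, the fits are run separately per dataset, and the resulting calibrations are standardized by their standard deviation so they land on a common metric for comparison -- can be sketched as follows. The Rasch probability function is the standard one-parameter logistic form; the item-severity numbers are hypothetical stand-ins for per-dataset calibrations, and the centering step is one plausible reading of "standardized by the standard deviation", not necessarily the exact procedure used in the study.

```python
import math
from statistics import mean, stdev

def rasch_prob(theta, b):
    """Rasch model: probability that a household with severity theta
    affirms an item with severity parameter b (1-PL logistic form)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def standardize(item_severities):
    """Put one dataset's independently estimated item severities on a
    common metric by centering and dividing by the standard deviation."""
    m, s = mean(item_severities), stdev(item_severities)
    return [(b - m) / s for b in item_severities]

# Hypothetical item-severity calibrations from two datasets, each fitted
# separately, so their scales and origins differ.
dataset_a = [-1.2, -0.4, 0.1, 0.6, 1.5]
dataset_b = [-2.0, -0.9, 0.3, 1.1, 2.4]

std_a = standardize(dataset_a)
std_b = standardize(dataset_b)

# After standardizing, item orderings and relative spacings can be
# compared across datasets even though the fits were independent.
print([round(x, 2) for x in std_a])
print([round(x, 2) for x in std_b])
```

In an actual analysis the severity parameters would come from fitting the Rasch model to each dataset's household responses (e.g. via a psychometrics package), not from hand-entered values; only the comparison-after-standardization step is being illustrated here.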

MS. GORDON: I think I'm going to conclude here, and we will have our coffee and networking break. So you'll have an opportunity to engage with Megan afterward.

But I just want to thank you for an excellent presentation and for your questions and for your patience as we went a little bit over. So if you can help me in giving Megan a hand.


MS. GORDON: Okay. So we have our break now and we'll be back at 11:00 for the next presentation. Thank you.