In the current Ed-Fi ecosystem, K12 student information systems (SIS) have used Ed-Fi to integrate directly with both State Education Agency (SEA) and Local Education Agency (LEA) systems, most often by providing standardized, regular data feeds to an Ed-Fi API. In an increasing number of cases, SIS systems are working with both the state and a group of school districts in the same state. In other words, the same SIS is providing two data feeds at the same time: one to the state and one to the local district.

This situation raises the question: Is the data feed the student information system sends to the state identical to the data feed the student information system sends to LEA systems?

In the current ecosystem, most evidence points to these data streams having a few important differences. These differences arise because the state and the LEA operate in different contexts and have different needs. In this blog post, we’ll focus mostly on ways these SEA and LEA streams differ.

Enumerations: SEA vs LEA

SEAs are the agents of state and federal policy, which shapes their data requirements and focuses much, but not all, of their work on compliance reporting. The option sets used to classify data elements in areas like academic subject, grade level, and student discipline are very often prescribed by state or federal policy. In such a context, these elements are often constrained to a limited set of values.

However, at an LEA level, the focus is more strongly on operational excellence for classroom instruction. In this context, the LEA needs the freedom to define nuances in standardized classifiers, for concepts like academic subjects and even for ones like grade level. These differences from state values are important because they provide local adaptability and flexibility in how academic operations are managed.

When the SIS transmits such option sets in a data feed, context matters a lot. In local use cases, the LEA generally wants to preserve local values, but in state reporting, it needs to send the mandated standard state set. In its communications, therefore, the SIS has to account for two separate contexts where different option sets may apply. That results in an important difference in the data feed from the SIS.
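As a rough illustration of the two contexts, here is a minimal Python sketch of how an SIS might choose which option-set value to send. Everything in it is an assumption made for illustration: the subject values, the mapping table, and the function name `subject_for_feed` are hypothetical and are not part of any Ed-Fi specification or real state code set.

```python
# Hypothetical mapping from local (LEA) academic-subject values to a
# state-mandated set. The values are illustrative, not real code sets.
LOCAL_TO_STATE_SUBJECT = {
    "Algebra I": "Mathematics",
    "Geometry": "Mathematics",
    "AP Biology": "Science",
    "Creative Writing": "English Language Arts",
}


def subject_for_feed(local_value: str, target: str) -> str:
    """Return the subject value to send to the given feed ("LEA" or "SEA")."""
    if target == "LEA":
        # The local feed preserves the district's own value.
        return local_value
    if target == "SEA":
        # The state feed must use the mandated set; fail loudly if unmapped.
        if local_value not in LOCAL_TO_STATE_SUBJECT:
            raise ValueError(f"No state mapping for local value {local_value!r}")
        return LOCAL_TO_STATE_SUBJECT[local_value]
    raise ValueError(f"Unknown target {target!r}")
```

The same local record yields a different value depending on the destination, which is exactly the difference between the two feeds described above.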

Data Definitions and Context

In some cases, the context difference between SEAs and LEAs also results in a different interpretation of a data definition, and therefore a different value, for each context.

As an example, take AttendanceEvent.EventDuration, which is defined as “The amount of time for the [attendance] event as recognized by the school: 1 day = 1, 1/2 day = 0.5, 1/3 day = 0.33.”

The state, thinking of the need to comply with state policy, may restrict this value to “1”, “0.5”, or “0” and not allow other partial values. In a way this may make perfect sense: the state context only observes granularity at the level of a half-day.

However, at a local level, schools may want a lot more granularity, and may want to observe very precise calculations of days – e.g. “0.125” or 1/8th of a day. In the LEA context, therefore, it may make sense to set the value of AttendanceEvent.EventDuration to “0.125” – a value not allowed at the state level.
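To make the contrast concrete, the following Python sketch checks a duration against the state's allowed values from the example above and, under an assumed rule, coarsens a local value for state reporting. The rounding rule (snap to the nearest allowed value, with ties going to the lower value) is purely hypothetical; a real state would define its own business rule, and the function names are invented for this illustration.

```python
from decimal import Decimal

# Values the hypothetical state feed accepts, per the example above.
# Kept in ascending order so that ties resolve to the lower value.
STATE_ALLOWED = (Decimal("0"), Decimal("0.5"), Decimal("1"))


def is_valid_for_state(duration: Decimal) -> bool:
    """True if the duration is one of the state's allowed values."""
    return duration in STATE_ALLOWED


def coarsen_for_state(duration: Decimal) -> Decimal:
    """Assumed rule: snap to the nearest allowed value (ties go lower)."""
    # min() returns the first minimal element, so ascending order
    # in STATE_ALLOWED makes an exact tie resolve to the lower value.
    return min(STATE_ALLOWED, key=lambda allowed: abs(allowed - duration))


local_value = Decimal("0.125")                # valid in the LEA feed
print(is_valid_for_state(local_value))        # False
print(coarsen_for_state(local_value))         # 0
```

Note that under this assumed rule a local value of 0.125 becomes 0 in the state feed, so information is lost. That is one reason the two streams cannot simply be treated as copies of each other.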

There are other possible examples. For example, the state may not collect Pre-K attendance in certain contexts if Pre-K falls outside of the state policy. However, the local LEA will almost certainly want to monitor Pre-K attendance, and so expect such data to appear in its Ed-Fi feed.

One solution to many (but perhaps not all) such cases is for states to accept more granular, nuanced data, and then accept the work of translating that data into their context. That’s certainly the path we hope to see emerge, but it requires them to accept more responsibility and is not consistent with how they are accustomed to operating. But even then it may not be possible in all circumstances, such as the example of Pre-K attendance, as data not required by the state should not be shared with the state.

Aggregate vs Granular Data

SEAs are accustomed to asking for aggregate values, but LEAs often want granular data, and this leads to some differences in data feeds from SIS systems today. As an example, let’s consider calculations of “total instructional time” for an academic calendar day at a particular school.

The SEA often asks for an aggregate value in such a case, as the SEA is not focused on the details of bell schedules and doesn’t want to do all the calculations needed to arrive at a value. Rather, they just want the final number. As a result, in this case, the SEA extends the data model and adds its own field to capture this value, resulting in a difference between the LEA and SEA data streams.

However, let’s also consider the case in which the SEA is willing to accept and use the granular bell schedule data. Even then, it can still be difficult for the SEA, as it might force them to do interpretative work. For example, say the school has two bell schedules for that day: one for grades PreK-K and one for grades 1-6, resulting in 2 possible values for “total instructional time”. Which one does the SEA use, when they only have space for a single aggregate value? Business rules can, of course, be defined to solve these issues, but those may not exist today.

There are also trickier cases in the ecosystem. We have seen cases in which states mandate elements of formulas for standardized calculations of instructional time. For example, in one case a state required a specific value for hallway “passing time” (the time students spend going from one class period to the next) in its calculations. Here, the SEA calculation for “instructional time” cannot be based on the actual bell schedule! The state is effectively mandating a customized calculation: it has inserted its own values into the formula, regardless of what the actual bell schedule says.
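A small Python sketch can show how these numbers diverge. The bell schedules, the mandated passing time of 5 minutes, and the state formula itself (span from first bell to last bell, minus a fixed passing time per transition) are all assumptions invented for illustration; they do not describe any actual state's rule.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two "HH:MM" times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Hypothetical bell schedules: (start, end) of each class period.
PREK_K = [("08:30", "09:15"), ("09:20", "10:05")]
GRADES_1_6 = [("08:00", "08:50"), ("08:54", "09:44"), ("09:48", "10:38")]


def actual_instructional_minutes(periods) -> int:
    """Sum of the real period lengths from the bell schedule."""
    return sum(minutes_between(start, end) for start, end in periods)


def state_formula_minutes(periods, mandated_passing_minutes: int) -> int:
    """Assumed state formula: first bell to last bell, minus a fixed
    passing time per transition, ignoring the actual gaps."""
    span = minutes_between(periods[0][0], periods[-1][1])
    transitions = len(periods) - 1
    return span - transitions * mandated_passing_minutes


print(actual_instructional_minutes(PREK_K))      # 90
print(actual_instructional_minutes(GRADES_1_6))  # 150
print(state_formula_minutes(GRADES_1_6, 5))      # 148
```

In this sketch one school day produces three candidate values: one per bell schedule, plus a state-formula value that matches neither the schedule's real passing time nor the granular data the LEA receives.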

Can These Streams Converge?

Based on the experience of our community, the Alliance believes that we can indeed push the SEA and local data streams to become more similar, but that these streams are unlikely to ever be identical.

The SEA Case

These streams will likely never converge fully because SEAs operate in an environment highly restricted by state and other governmental policies with regard to data collection and use. As the state education agency does not directly provide educational services for students, it is not reasonable to expect that the fine-grained details of student activity (such as are captured in Ed-Fi) should be shared with them. And indeed, such sharing is often understandably blocked by state policy. Blocking it is a mechanism to protect student privacy, and also makes sense according to standard IT operational norms, such as the principle of least privilege.

Given these constraints, strategies such as the creation of data aggregates or the use of more coarse-grained option sets, which result in an extended Ed-Fi API, are appropriate mechanisms to safeguard student data.

However, this does not mean that states are free to re-design the API or take whatever liberties they like with extensions, such as duplicating elements already defined in an Ed-Fi API, changing the semantics of elements, or requiring an aggregate value that they are capable of calculating.

Where states can use the Ed-Fi specifications, they can and must do so to protect the investment of SIS systems in their Ed-Fi implementations and in order to support the health of the overall Ed-Fi Community.

The School District Case

For school districts, unlike for states, we believe it is appropriate to ask that Ed-Fi data streams converge into a single stream capable of serving the needs of every K12 school district.

In aggregate, K12 school districts are a lot more alike in terms of their mission, operations, and norms than they are different, and unlike states, school districts are the providers of direct educational services and so have good reason to maximize the use of the data to improve student performance.

School districts must, of course, do so while respecting that not everyone in the district has a right to see everything, so security controls are an essential part of such an architecture (and indeed a huge part of the Ed-Fi platform). But there is no reason to block a district’s ability to get access to any of their own data directly from source systems, in a machine-readable, standardized format.

While there are differences in how school districts operate, and while those differences do result in different data elements, the goal of the Ed-Fi Data Standard is to allow those differences to coexist within the same data schema. When community members feel like there is a gap, the Ed-Fi Governance Process is the community-based mechanism for the evolution of the data model and APIs into new areas.

The alternative to working together in an open process to align school district data needs around a universal set of core requirements from source system vendors is fragmentation, which ultimately increases costs for the source systems (who must support multiple API “flavors”), reduces reliability (these “flavors” dilute efforts to ensure quality), and reduces the ability of school agencies to collaborate (as their implementations now differ).

Responsibilities

Both states and school districts have roles to play in the Ed-Fi ecosystem. States must extend Ed-Fi technology and APIs responsibly, and school districts must prioritize defining collective needs and aggregating collective demand as they expand the scope of their use of standardized data exchange. And both states and school districts must work together, meeting at the governance table to discuss and plan on common paths into new data domains or alignment, where possible, of data definitions.

To follow the rules and resist customization will often mean locally doing things a bit less efficiently than desired, but that is part of the pain of collaboration. There is a saying that you need to “collaborate until it hurts” and it’s true. However, it is the collective small sacrifices of community members that make the big wins for the entire community possible. It’s what is going to make it possible for educators in every school district to make the best possible decisions for their students, informed by the seamless and secure flow of student data between education systems.
