The data dilemma: Building trust in simulation without perfect data

Simulation can deliver powerful insights—but only if stakeholders trust the data behind it. This blog explores how teams are navigating imperfect inputs to build confidence and drive smarter decisions.
Steve Jones

Steve is the Global Leading Professional for Simulation at Haskoning. He is experienced in helping clients understand how simulation and digital technology can support their business in making the right decisions, gaining efficiency and de-risking change.

If you don’t have confidence in the data feeding a simulation model, how can you be confident in the output?

Teams are eager to expand the use of simulation in their organisation, but they can’t rely on access to ‘perfect’ data. The drive to build momentum is then hampered by a lack of stakeholder trust and buy-in.

That’s the data dilemma – which was a key topic at our recent Witness User Conference. Alexander Chirikadzi and Souryadeep Muzumder, DES Engineers at Jaguar Land Rover (JLR), led an interesting discussion on the issue. Let’s look at how JLR and others are tackling it.

Data issues affecting discrete event simulation

So much data is available from so many sources – so why is so much of it suboptimal for use in simulation modelling? The issues raised fell into 4 broad categories:

  • Fractured and incomplete
    Getting a clean simulation result is challenging if data is incomplete. This could involve missing key logic like process steps, set-ups or AGV flows, for example. Challenges also arise from delayed data, like receiving layout updates after installation or decommissioning.

  • Lacking operational validation
    This could be because lines, equipment and/or technologies are new, meaning vendors don’t have historic validation for what they’re proposing. Whether it’s re-timing steps, tracking AGVs or observing buffers, on-site validation adds time and complexity to the modelling process.

  • Idealised assumptions
    Data issues arise when cycle times and breakdowns assume an ideal state, and when people assume constraints have been validated.

  • Inconsistent formats
    Whether it’s spreadsheets, presentations, screenshots or sketches, lack of standardisation is a major challenge. These myriad formats often need to be transformed before they can be used in simulation.
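To make the last point concrete, here is a minimal sketch of the kind of transformation step inconsistent formats force on simulation teams. The field names, units and file name are hypothetical; the point is simply that every source variant has to be mapped onto one standard schema before a model can consume it.

```python
from dataclasses import dataclass

@dataclass
class StationRecord:
    """Standardised record a simulation model can consume directly."""
    station: str
    cycle_time_s: float
    source: str  # where the value came from, for traceability

def normalise(raw: dict, source: str) -> StationRecord:
    """Map a raw row (from a spreadsheet, export, etc.) onto the
    standard schema, tolerating common naming and unit variants."""
    name = raw.get("station") or raw.get("Station Name") or raw.get("id")
    # Cycle time may arrive in seconds or minutes depending on the source
    if "cycle_time_s" in raw:
        ct = float(raw["cycle_time_s"])
    elif "Cycle Time (min)" in raw:
        ct = float(raw["Cycle Time (min)"]) * 60.0
    else:
        raise ValueError(f"No cycle time found in row from {source}")
    return StationRecord(station=str(name), cycle_time_s=ct, source=source)

# Two rows from two differently formatted sources, unified into one schema
rows = [
    {"Station Name": "Weld-01", "Cycle Time (min)": 1.5},
    {"station": "Paint-02", "cycle_time_s": 75},
]
records = [normalise(r, source="line_A_export.xlsx") for r in rows]
```

Every such mapping is effort that standardised formats would make unnecessary – which is the motivation for the next section.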

Achieving availability, quality and speed

This is the simulation data trinity – having fast access to high-quality data helps deliver trustworthy results. In practice, this means having access to data that ticks the following boxes:

  • Traceable and source-linked: So simulation teams can check that it’s validated and reliable
  • Complete process logic: For example, including scrap, rework, breakdowns, buffers and AGVs
  • Standardised formats: So it doesn’t need reprocessing or transforming
  • Plug and play ready: So it can feed directly into simulation with minimal rework. The ultimate aim is to integrate with live station data from manufacturing systems, so data comes directly from sensors.
An example of simulation model data organised in an Excel interface.
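The checklist above can be automated as a simple readiness gate before data reaches a model. This is a sketch only – the required fields (cycle time, MTBF, MTTR, scrap rate, source link) are illustrative, not a definitive schema.

```python
# Hypothetical checklist: process-logic fields required per station record
REQUIRED_FIELDS = {"cycle_time_s", "mtbf_s", "mttr_s", "scrap_rate", "source"}

def readiness_report(record: dict) -> list[str]:
    """Return the gaps that stop a record being 'plug and play ready'."""
    gaps = [f"missing {f}" for f in sorted(REQUIRED_FIELDS - record.keys())]
    if "source" in record and not record["source"]:
        gaps.append("empty source link: value cannot be traced or validated")
    return gaps

# A record with a traceable cycle time but no breakdown data
station = {"cycle_time_s": 42.0, "scrap_rate": 0.02,
           "source": "MES export 2024-05-01"}
gaps = readiness_report(station)  # flags the missing MTBF/MTTR fields
```

Gaps flagged this way can be fed back to data owners before modelling starts, rather than surfacing as surprises in the results.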

When perfect data isn't an option: A pragmatic approach

The fact is, you won’t always achieve that data trinity. Simulation teams need ways of working with what they have – while supporting internal customers and vendors to continuously improve traceability, completeness, standardisation and data readiness.

Here’s how JLR and other Witness User Conference attendees are driving this:

  • Leveraging simulation across the lifecycle
    Different lifecycle stages have different data requirements. Taking a strategic approach to developing use cases across the lifecycle helps strengthen trust and data enablement.

    For example, black box models at concept phase can help teams understand areas like buffer sizes and storage capacities. At the design phase, you can build more detailed models as layouts mature.

    Then, at installation and manufacture, you can feed more data into the chosen detailed model. This can provide ideas for addressing issues in areas like reworks and quality, fine-tuning investment and unblocking ramp-up issues. Finally, you get to operations. This is when you can start supplementing the model with live data, helping identify and address bottlenecks that weren’t apparent in the install phase, and informing investment decisions for ongoing improvement.

    You therefore create a simulation cycle where you’re able to validate and enhance data inputs as you create momentum around the value of simulation.

  • Engaging stakeholders with model scoping
    Stakeholder alignment is key to overcoming the data dilemma. Often, the 4 data issues arise when stakeholders aren’t bought in or don’t fully understand what the model is trying to achieve. A robust model scoping process mitigates this by helping people across teams understand and prioritise their challenges – and understand how those challenges can be addressed in the modelling. This common understanding then dictates the data needed to realise the agreed objectives.

  • Validating early
    The lifecycle stage and model scope will dictate the data requirements. But it’s crucial that assumptions are discussed openly. The default should be to mistrust base assumptions, and to validate them with actual data where possible. If validation isn’t feasible, ensure stakeholders agree with the assumptions to maintain their trust in the outputs.

  • Building light, testing often
    This ties into validating assumptions. Building a minimum viable product with assumptions gives customers an idea of what the model looks like and where it’s going. You can then make it progressively more representative. That way, customers aren’t waiting until the end of the modelling project to see results. Instead, they have regular check-ins to feed back on what works and what needs refinement.

    Using standard model build templates helps here. They give you speed and consistency, enabling an agile build-test-refine cycle.

  • Visualising results with the customer in mind
    It’s easier for stakeholders to see the value of the model (and the value of improving their data inputs) when the outputs are meaningful. It depends on the stakeholder, but often visualisations are more effective than statistical reports. Whether it’s 3D, spreadsheets or Power BI dashboards, contextualise results with a relevant and familiar interface. Knowing they can get access to a new pie chart or metric by improving data provision can be a strong motivator for stakeholders.
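As an illustration of the ‘validating early’ point above, a minimal check might compare an assumed value against a handful of on-site observations. The figures, field names and 10% tolerance here are hypothetical; the principle is that anything outside tolerance goes back to stakeholders for discussion rather than silently into the model.

```python
from statistics import mean

def assumption_holds(assumed: float, observed: list[float],
                     tolerance: float = 0.10) -> bool:
    """Flag whether an assumed value sits within a relative tolerance
    of the observed mean; anything outside needs stakeholder review."""
    m = mean(observed)
    return abs(assumed - m) / m <= tolerance

# Vendor-quoted cycle time vs. a handful of timings taken on site
assumed_ct = 55.0
observed_ct = [64.2, 66.8, 63.5, 67.1, 65.0]
ok = assumption_holds(assumed_ct, observed_ct)  # the quote is too optimistic
```

Even a crude check like this makes the ‘mistrust base assumptions’ default actionable, and gives stakeholders a concrete number to agree or challenge.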

Deliver confidence despite data constraints

The data dilemma doesn't have to be a roadblock to successful simulation. The key is to take a pragmatic approach: use the reliable data you have for the appropriate model type, and think about how you can validate as you go – using simulation to continuously improve data reliability as you build a model ecosystem.

This then helps position the simulation team as a true capability provider instead of a purveyor of black arts whom stakeholders can ignore if they disagree with model results.

If you're facing similar data challenges in your simulation projects and would like support in implementing these strategies, we're here to help. Contact Ben Lomax Thorpe, Haskoning’s Simulation Advisory Group Commercial Director, below.

Ben Lomax Thorpe - Commercial Director - Simulation