Merging different JSON data structures into single array with JavaScript

I’m working with multiple data sources that return JSON in different formats. I need to combine all this data into one unified array but I’m stuck on the implementation.

First data source returns:

[
  {productId: "101", category: "electronics", timestamp: "1609459200"},
  {productId: "205", category: "electronics", timestamp: "1609545600"}
]

Second data source (Twitter API) returns:

[
  {tweet_id: "987654321098", posted_at: "1609459200"},
  {tweet_id: "123456789012", posted_at: "1609545600"}
]

I need to transform the Twitter data so it matches the first structure. Specifically, I want to add a ‘category’ field and rename ‘posted_at’ to ‘timestamp’. Then combine everything into a single array that I can sort by the timestamp field.

What’s the best approach to handle this data transformation and merging in JavaScript?

Use map() to transform the Twitter data, then merge the two arrays with the spread operator and sort by timestamp. Try twitterData.map(item => ({productId: item.tweet_id, category: 'social', timestamp: item.posted_at})). Works great for me.
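Spelled out as a runnable sketch, using the sample data from the question (the 'social' category label is just a placeholder for whatever fits your domain):

```javascript
// Sample data matching the two source shapes from the question
const productData = [
  { productId: "101", category: "electronics", timestamp: "1609459200" },
  { productId: "205", category: "electronics", timestamp: "1609545600" },
];
const twitterData = [
  { tweet_id: "987654321098", posted_at: "1609459200" },
  { tweet_id: "123456789012", posted_at: "1609545600" },
];

// Reshape the Twitter records to match the product schema
const normalizedTweets = twitterData.map((item) => ({
  productId: item.tweet_id,
  category: "social", // placeholder label
  timestamp: item.posted_at,
}));

// Merge with spread syntax and sort numerically by timestamp
const combined = [...productData, ...normalizedTweets].sort(
  (a, b) => Number(a.timestamp) - Number(b.timestamp)
);
```

Note the Number() coercion in the comparator: the timestamps arrive as strings, and comparing them as strings would break once values have different lengths.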

The JavaScript solutions mentioned work fine for small datasets, but they break down when you scale up or add more data sources.

I’ve built systems pulling from dozens of APIs with totally different schemas. Writing and maintaining transformation code for each one is a nightmare. Every time an API changes a field name or adds properties, someone has to dig into the code and debug.

What actually works is automated data pipelines that handle transformation and merging without custom JavaScript. You define your target schema once, then visually map each data source to match it.

For your case, you’d create a workflow connecting to both APIs, transform the Twitter data to match your product schema, merge everything into one array, and sort by timestamp. When Twitter changes their API or you add new sources, just update the mappings instead of rewriting code.

I’ve replaced tons of brittle JavaScript transformation logic with automated workflows. Way more reliable and easier to maintain when requirements change.

Latenode handles these data transformation pipelines perfectly: https://latenode.com

The Problem:

You’re struggling to combine data from multiple JSON sources with differing structures into a single, unified array in JavaScript. Specifically, you need to transform data from a Twitter API to match the structure of another data source, and then merge and sort the combined array by timestamp.

:thinking: Understanding the “Why” (The Root Cause):

Manually transforming and merging JSON data from multiple sources is error-prone and becomes increasingly difficult to maintain as the number of sources grows or their schemas change. Hardcoding transformations in JavaScript creates brittle code that requires significant refactoring whenever an API changes its response structure, which hurts both scalability and maintainability. A more robust solution normalizes each source to a shared schema before merging the datasets.

:gear: Step-by-Step Guide:

Step 1: Normalize Data Sources Separately.

Before merging your datasets, create separate transformation functions to normalize each source’s data into a consistent structure. This makes the merging process significantly simpler and more robust. This example uses destructuring and concise mapping for efficient transformation.

First, define your target schema. Let’s say it’s: { productId: string, category: string, timestamp: number }.

Next, create a transformation function for each data source to match this schema. For your example:

const transformSource1 = (data) =>
  data.map((item) => ({ ...item, timestamp: parseInt(item.timestamp, 10) })); // Coerce timestamp to number so it matches the target schema

const transformTwitterData = (data) => data.map(({ tweet_id: productId, posted_at: timestamp }) => ({
  productId,
  category: 'social',
  timestamp: parseInt(timestamp, 10), // Convert timestamp to number for reliable sorting
}));

Step 2: Merge and Sort the Normalized Data.

Once each data source is transformed to match the target schema, merge them using the spread syntax and sort by the timestamp field.

const transformedSource1 = transformSource1( /* your source 1 data here */ );
const transformedTwitter = transformTwitterData( /* your Twitter API data here */ );

const combinedData = [...transformedSource1, ...transformedTwitter].sort((a, b) => a.timestamp - b.timestamp);

console.log(combinedData);

Step 3: Handle Errors and Data Validation (Optional but Recommended).

Add error handling and data validation within your transformation functions to gracefully handle unexpected data formats or missing fields. This prevents unexpected behavior or crashes when dealing with inconsistent or incomplete data from APIs. Consider adding checks to ensure productId and timestamp exist and are of the correct type before merging.
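A minimal validation layer might look like this (the specific field checks are illustrative, not exhaustive; adapt them to your schema):

```javascript
// Check one normalized record against the target schema
const isValidRecord = (record) =>
  typeof record.productId === "string" &&
  record.productId.length > 0 &&
  Number.isFinite(record.timestamp);

// Filter out malformed records and warn about them instead of crashing
const validateRecords = (records) =>
  records.filter((record) => {
    const ok = isValidRecord(record);
    if (!ok) console.warn("Dropping malformed record:", record);
    return ok;
  });
```

Run each transformed array through validateRecords() before merging, so bad data is caught at the source boundary rather than surfacing later as a sorting or rendering bug.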

:mag: Common Pitfalls & What to Check Next:

  • Data Type Mismatches: Ensure all timestamp values are numbers (not strings) for reliable sorting. Use parseInt() or similar methods to convert strings to numbers if necessary.
  • Missing Fields: Your transformation functions should gracefully handle cases where a field might be missing from a data source. Consider using default values or error handling for robustness.
  • API Changes: Be prepared for APIs to change their response formats. Design your transformation functions to be flexible and easily adaptable to schema changes. Consider using a more robust solution for handling data integration from multiple sources should the complexity grow.
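As one way to apply the "missing fields" advice above, defaults can be supplied during the mapping step itself (the fallback values here are just examples):

```javascript
// Normalize a tweet, falling back to defaults for absent fields
const normalizeTweet = (item) => ({
  productId: item.tweet_id ?? "unknown", // default for a missing id
  category: "social",
  timestamp: Number(item.posted_at ?? 0), // missing dates sort first
});
```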

:speech_balloon: Still running into issues? Share your (sanitized) sample data, the transformation code you tried, and the exact error or unexpected output. The community is here to help!

Hit this all the time building dashboards. Don’t merge first and then transform - do it the other way around. Set up a standard interface that each data source follows before you combine anything. I always define my base schema first, then write adapter functions for each API. Like const adaptTwitterData = (data) => data.map(item => ({ productId: item.tweet_id, category: 'social_media', timestamp: Number(item.posted_at) })). Then just const combined = [...adaptedSource1, ...adaptedSource2].sort((a, b) => a.timestamp - b.timestamp). This saves you when you need validation or error handling later. You catch bad data at the adapter level instead of hunting down weird merge bugs. Also store timestamps as numbers, not strings - way more reliable for sorting across different date formats.

I hit the same issue with messy API responses. You need a transformation pipeline that normalizes schemas before merging anything. Here’s what works: transform each data source separately first. For Twitter data, use destructuring and property renaming: const normalizedTwitter = twitterData.map(({tweet_id: productId, posted_at: timestamp}) => ({productId, category: 'social', timestamp})). Way cleaner than assigning properties one by one. Then merge and sort: [...firstSource, ...normalizedTwitter].sort((a, b) => parseInt(a.timestamp, 10) - parseInt(b.timestamp, 10)). Best part? It’s super extensible. When you add more APIs later, just create new transformation functions with the same interface. I’ve used this pattern with 8+ different APIs and never had to refactor the core logic.
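The "same interface" idea scales naturally to a registry of transform functions, something along these lines (the registry keys and helper name are hypothetical):

```javascript
// Each source registers a transform that emits the shared schema
const transformers = {
  products: (data) =>
    data.map((d) => ({ ...d, timestamp: Number(d.timestamp) })),
  twitter: (data) =>
    data.map(({ tweet_id: productId, posted_at: timestamp }) => ({
      productId,
      category: "social",
      timestamp: Number(timestamp),
    })),
};

// Merge any number of sources through their registered transformers
const mergeSources = (sources) =>
  Object.entries(sources)
    .flatMap(([name, data]) => transformers[name](data))
    .sort((a, b) => a.timestamp - b.timestamp);
```

Adding a ninth API then means adding one entry to the registry; the merge and sort logic never changes.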

This topic was automatically closed 6 hours after the last reply. New replies are no longer allowed.