Import JSON into Postgres using COPY

json postgres ndjson sql jq

Loading data into a database is an essential, frequently used feature, and most databases offer some mechanism for importing external data into tables.

Luckily for us, Postgres has a very handy command called COPY. The COPY command can be used to copy contents from one table to another, or to copy data from a file into a table.

COPY is a very powerful tool, and it's blazing fast too, allowing us to import multi-GB files in seconds when done properly.

The COPY command can import data in several formats: text (tab-delimited by default), CSV, and binary. JSON isn't a native COPY format, but newline-delimited JSON can be loaded through the text format. In this article we'll focus on JSON, since it can get tricky depending on how your file is structured.

Importing JSON arrays

If your file is structured this way (note that the rows are inside an array):

[
    {"id":1,"first_name":"Kelly","last_name":"Freddi","email":"kfreddi0@wikipedia.org","gender":"Male","ip_address":"79.111.233.14"},
    {"id":2,"first_name":"Orazio","last_name":"Playhill","email":"oplayhill1@bbb.org","gender":"Male","ip_address":"94.111.107.131"},
    ...
]

You will have to perform an extra step to transform the file into Newline Delimited JSON (NDJSON), which should look like this:

{"id":1,"first_name":"Kelly","last_name":"Freddi","email":"kfreddi0@wikipedia.org","gender":"Male","ip_address":"79.111.233.14"}
{"id":2,"first_name":"Orazio","last_name":"Playhill","email":"oplayhill1@bbb.org","gender":"Male","ip_address":"94.111.107.131"}
...

Note how each JSON object is separated by a newline (no commas), and the array's opening [ and closing ] brackets are gone.

To convert our file from normal JSON to Newline Delimited JSON you could just use our converter, which will do it for you, or you can use a command line tool like jq:

$ jq -c '.[]' your_file.json > your_new_file.json

This simple command will take your JSON file in array format and convert it into an NDJSON file.

Now you can use the COPY command to import your data into Postgres.

Importing Newline Delimited JSON data

Use psql to connect to your database:

$ psql -h YOUR_HOST -U YOUR_USER -d YOUR_DATABASE

First, create a temporary table where you’ll import your data.

CREATE TABLE temp (data jsonb);

Once the table is created, let’s import our data.

Since COPY is a server-side command, it can't access files on your client machine. Use psql's \copy meta-command instead, which behaves the same but reads the file from the client:

\copy temp (data) FROM 'my_data.json';

This will import each JSON object into a row of the temporary table. Now you can easily query the data like this:

SELECT data->>'id', data->>'first_name'
FROM temp;

  id  |  first_name  
------+--------------
 1    | Kelly
 2    | Orazio
 3    | Sabine
...

Moving the JSON data into a proper table

Now we have our JSON data, but it's all in a single JSONB column. We'll usually want to move it into a proper table to guarantee types and structure, which is highly recommended.

You can use an INSERT INTO ... SELECT statement to do this easily. Here's an example (you can also use an existing table):

CREATE TABLE users (id SERIAL, first_name TEXT);

INSERT INTO users (id, first_name)
SELECT (data->>'id')::integer, data->>'first_name'
FROM temp;

This will insert all the records into the users table, and Postgres will raise errors if there are type mismatches, which is very helpful for ensuring data integrity.

Note: If you're inserting a large number of records, indexes and triggers on the destination table can greatly increase the time it takes to load your data. You can temporarily disable them during the insert to improve speed significantly.
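As a rough sketch of that idea (the index name below is hypothetical, and disabling triggers requires owning the table or superuser privileges), the bulk insert could be wrapped like this:

```sql
-- Hypothetical index; use your table's actual index names.
DROP INDEX IF EXISTS users_first_name_idx;
ALTER TABLE users DISABLE TRIGGER ALL;

INSERT INTO users (id, first_name)
SELECT (data->>'id')::integer, data->>'first_name'
FROM temp;

ALTER TABLE users ENABLE TRIGGER ALL;
-- Recreate the index once the data is loaded.
CREATE INDEX users_first_name_idx ON users (first_name);
```

Be careful: DISABLE TRIGGER ALL also disables the internal triggers that enforce foreign-key constraints, so only do this when you trust the incoming data.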

Make sure you delete the temporary table once you’re done!

DROP TABLE temp;