Technical
Building a Content Pipeline with Python Scripts
Publishing 365 articles by hand through an admin UI would take weeks of tedious clicking. Instead, I built a content pipeline using Python scripts that batch-create articles via API. Here is how to build your own content automation.
Why Scripts Beat Manual Entry
When you need to create more than a handful of content items, manual entry fails:
- It is slow (copy-paste, fill forms, click publish, repeat)
- It is error-prone (typos in slugs, wrong categories, missing fields)
- It is not reproducible (if the database resets, you do it all again)
- It is not versionable (no git history of what was published)
Scripts solve all four problems.
The Pattern
Every content creation script follows the same structure:
#!/usr/bin/env python3
import json
import urllib.request
API_URL = 'http://localhost:8321/posts'
ARTICLES = [
{
'title': 'Article Title',
'slug': 'article-title',
'body': 'Markdown content here...',
'excerpt': 'Brief summary',
'categories': ['Technical', 'Python'],
'status': 'published',
'publishedAt': '2025-05-01T10:00:00Z',
},
]
def post_article(payload):
data = json.dumps(payload).encode('utf-8')
req = urllib.request.Request(
API_URL, data=data,
headers={'Content-Type': 'application/json'},
method='POST',
)
with urllib.request.urlopen(req) as resp:
return json.loads(resp.read())Why urllib Instead of requests
I use urllib.request from the standard library instead of the popular requests package. The reason is simple: zero dependencies. The script runs on any machine with Python 3 installed. No virtual environment, no pip install, no dependency conflicts.
For scripts that run once and create data, simplicity beats convenience.
Error Handling for Idempotent Scripts
Good content scripts are idempotent: you can run them multiple times without creating duplicates. Handle the 409 Conflict response that your API returns for duplicate slugs:
try:
with urllib.request.urlopen(req) as resp:
print(f'CREATED: {payload["slug"]}')
except urllib.error.HTTPError as e:
if e.code == 409:
print(f'EXISTS: {payload["slug"]}')
else:
print(f'ERROR {e.code}: {e.read().decode()}')Version Control Your Content
Save your scripts in a .planning/artifacts/ directory and commit them to git. This gives you a reproducible content pipeline. If the database resets, run the scripts again. If you need to review what was published, read the scripts.
This is especially valuable during development with in-memory databases like DynamoDB Local. Every container restart wipes the data. The scripts recreate it in seconds.
Scaling the Pipeline
For 365 articles, I split scripts by month (25-31 articles each). Each script is independent and can run in any order. This keeps individual files readable and makes it easy to debug issues with specific date ranges.
Build your content as code. Version it. Automate it. Never manually enter 365 articles.
See the Python urllib documentation for the standard library HTTP client reference.
RELATED READING
The Consulting Shift I Am Making In Year Two
After a year of writing and building, my consulting practice is changing shape. Shorter engagements. Sharper outcomes.
ReadThe Frontend Shift: Shipping Less JavaScript In Year Two
A year ago I reached for Next.js for everything. This year I often reach for nothing.
ReadThe Serverless Lesson I Would Write On A Sticky Note
After a year of shipping serverless projects, one rule explains most of the wins and all of the losses.
Read