🔥 Why Predict Campaign Success?
Marketers spend millions on ad campaigns across Facebook, Google, Zalo OA, TikTok and more. But:
- Some campaigns convert like crazy.
- Others burn budget with little return.
What if we could predict campaign success before launch? This post shows you how to build a deep neural network (DNN) for campaign success prediction, using realistic schema design + synthetic training data.
🧩 Step 1: Designing the Campaign Data Schema
A good schema balances:
- What platforms provide (impressions, clicks, spend, ROI).
- What models need (normalized features, labels).
Here’s the schema I use:
| Column | Type | Description |
|---|---|---|
| campaign_id | STRING | Unique campaign ID |
| platform | STRING | Platform (Facebook, Google, Zalo, etc.) |
| start_date | DATE | Campaign start date |
| end_date | DATE | Campaign end date |
| digital_media_consumption | FLOAT | Engagement index [0–1] |
| timing | FLOAT | Seasonality/time factor [0–1] |
| size | INT | Campaign size (# ads delivered or budgeted audience) |
| age_group_distribution | FLOAT | Share of target age group [0–1] |
| frequency | INT | Avg. times a user saw the ad |
| online_population | INT | Target region’s online population |
| reachable_number | INT | Estimated audience from targeting |
| impressions | INT | Total impressions |
| clicks | INT | Total clicks |
| ctr | FLOAT | Click-through rate (clicks / impressions) |
| conversions | INT | Conversions (purchases/sign-ups) |
| conversion_rate | FLOAT | Conversions / clicks |
| spend | FLOAT | Campaign spend in USD |
| cpc | FLOAT | Cost per click |
| roi | FLOAT | Return on investment |
| label | INT | Success flag (1 = success, 0 = fail) |
👉 Label definition: you decide what counts as success.
- Example: ROI > 1.5 → label = 1, else label = 0.
🛠 Step 2: Generate Synthetic Campaign Data
Let’s generate 1,000 fake campaigns for testing.
```python
import pandas as pd
import numpy as np

np.random.seed(42)
n_samples = 1000
platforms = ["Facebook", "Google", "Zalo", "TikTok"]

start_dates = pd.date_range("2025-01-01", periods=n_samples, freq="D")
durations = pd.to_timedelta(np.random.randint(1, 31, n_samples), unit="D")  # 1–30 day runs

data = {
    "campaign_id": [f"CMP-{i:04d}" for i in range(n_samples)],
    "platform": np.random.choice(platforms, n_samples),
    "start_date": start_dates,
    "end_date": start_dates + durations,
    "digital_media_consumption": np.random.rand(n_samples),
    "timing": np.random.rand(n_samples),
    "size": np.random.randint(1000, 10000, n_samples),
    "age_group_distribution": np.random.rand(n_samples),
    "frequency": np.random.randint(1, 15, n_samples),
    "online_population": np.random.randint(50000, 1000000, n_samples),
    "reachable_number": np.random.randint(1000, 50000, n_samples),
    "impressions": np.random.randint(10000, 500000, n_samples),
    "clicks": np.random.randint(100, 20000, n_samples),
    "conversions": np.random.randint(10, 2000, n_samples),
    "spend": np.random.uniform(100, 10000, n_samples),
}

# Derived metrics
data["ctr"] = data["clicks"] / data["impressions"]
data["conversion_rate"] = data["conversions"] / (data["clicks"] + 1)
data["cpc"] = data["spend"] / (data["clicks"] + 1)
data["roi"] = (data["conversions"] * 50) / (data["spend"] + 1)  # assume $50 per conversion
data["label"] = (data["roi"] > 1.5).astype(int)

df = pd.DataFrame(data)
df.to_csv("real_campaign_data.csv", index=False)
print(df.head())
```
This creates realistic-looking data with impressions, clicks, conversions, spend, and ROI.
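Before training, it’s worth a quick sanity check on the class balance, since the ROI rule above can produce skewed labels. A minimal sketch that re-applies the same rule to fresh draws from the same distributions:

```python
import numpy as np

np.random.seed(42)
n = 1000
conversions = np.random.randint(10, 2000, n)
spend = np.random.uniform(100, 10000, n)

# Same $50-per-conversion assumption and 1.5 ROI threshold as the generator above
roi = (conversions * 50) / (spend + 1)
label = (roi > 1.5).astype(int)

print(f"Positive share: {label.mean():.2%}")
```

With these ranges the positive class tends to dominate; if the split comes out very lopsided, raise the ROI threshold or pass class weights to `model.fit`.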
🤖 Step 3: Build the Deep Neural Network (DNN)
We’ll use TensorFlow’s Keras API in Google Colab.

```
!pip install tensorflow scikit-learn pandas
```

```python
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load dataset
df = pd.read_csv("real_campaign_data.csv")

# Features (drop IDs, dates, and the label)
X = df.drop(["campaign_id", "platform", "start_date", "end_date", "label"], axis=1).values
y = df["label"].values

# Scale features
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Train/test split (stratified, so both classes appear in the test set)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Build the model: 128 → 70 → 50 → 26 → 1
model = models.Sequential([
    layers.Input(shape=(X_train.shape[1],)),   # input layer
    layers.Dense(128, activation="relu"),
    layers.Dense(70, activation="relu"),
    layers.Dense(50, activation="relu"),
    layers.Dense(26, activation="relu"),
    layers.Dense(1, activation="sigmoid")      # output: success probability
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

# Train
history = model.fit(X_train, y_train, epochs=20, batch_size=32,
                    validation_data=(X_test, y_test))
```
📈 Step 4: Evaluate the Model
```python
from sklearn.metrics import classification_report

y_pred = (model.predict(X_test) > 0.5).astype(int)
print(classification_report(y_test, y_pred))
```
Example output (your exact numbers will vary with the seed and label threshold):

```
              precision    recall  f1-score   support

           0       0.82      0.84      0.83        98
           1       0.87      0.85      0.86       102

    accuracy                           0.85       200
```
Not bad for synthetic data 👌. One caveat: since `label` was derived directly from `roi`, and `roi` is also an input feature, the network is largely re-learning that threshold. With real campaign logs, drop post-hoc outcome features like `roi` from the inputs to avoid leakage.
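If you want to see where precision and recall come from, you can compute them by hand from a confusion matrix. A small self-contained example with made-up predictions (not the model’s actual output):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predictions, just to illustrate the arithmetic
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = tp / (tp + fp)  # of predicted successes, how many were real
recall = tp / (tp + fn)     # of actual successes, how many we caught
print(f"TP={tp} FP={fp} FN={fn} TN={tn} precision={precision:.2f} recall={recall:.2f}")
```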
🚀 Step 5: Predict New Campaigns
```python
# Column order must match the training features (df minus IDs, dates, and label)
new_campaign = pd.DataFrame([{
    "digital_media_consumption": 0.65,
    "timing": 0.40,
    "size": 6000,
    "age_group_distribution": 0.55,
    "frequency": 8,
    "online_population": 300000,
    "reachable_number": 15000,
    "impressions": 120000,
    "clicks": 4000,
    "conversions": 300,
    "spend": 2500.0,
    "ctr": 0.033,
    "conversion_rate": 0.075,
    "cpc": 0.63,
    "roi": 1.8
}])

X_new = scaler.transform(new_campaign)
prediction = model.predict(X_new)
print("✅ Success probability:", float(prediction[0][0]))
```
Output:

```
✅ Success probability: 0.73
```
🎯 Key Takeaways
- A clear schema is critical for ML in marketing.
- You can prototype with synthetic data before using real campaign logs.
- A deep neural network can capture nonlinear interactions (budget × frequency × CTR).
- With real data, this becomes a powerful campaign optimization engine.
👉 Next Steps:
- Replace synthetic data with real campaign logs (from FB Ads API, Google Ads API, or your CRM).
- Tune the model (hyperparameters, embeddings for categorical data).
- Deploy as an API service to score new campaigns before launch.
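On the embeddings point: one way to sketch it is to feed the `platform` column as an integer index through a Keras `Embedding` layer and concatenate the learned vector with the numeric features (layer sizes here are illustrative, not tuned):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

platforms = ["Facebook", "Google", "Zalo", "TikTok"]
platform_to_idx = {p: i for i, p in enumerate(platforms)}

# Two inputs: the platform index and the numeric feature vector
platform_in = layers.Input(shape=(1,), dtype="int32")
numeric_in = layers.Input(shape=(15,))          # 15 numeric features, as in Step 3

emb = layers.Embedding(input_dim=len(platforms), output_dim=4)(platform_in)
emb = layers.Flatten()(emb)                     # (None, 1, 4) -> (None, 4)

x = layers.Concatenate()([emb, numeric_in])
x = layers.Dense(64, activation="relu")(x)
out = layers.Dense(1, activation="sigmoid")(x)

model = models.Model([platform_in, numeric_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```

Training then takes two inputs: `model.fit([platform_idx_array, X_numeric], y, ...)`.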
💡 Imagine a marketer running “What-if” scenarios: change budget, frequency, or targeting — and instantly get predicted ROI. That’s the future of AI-driven marketing.
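As a toy sketch of that what-if loop, vary one input, hold the rest fixed, and re-score each variant. Here `score_campaign` is a hypothetical stand-in for the scaling + `model.predict` pipeline from Step 5:

```python
# Hypothetical scorer standing in for scaler.transform + model.predict:
# diminishing returns on spend, mild penalty once frequency causes ad fatigue
def score_campaign(spend: float, frequency: int) -> float:
    base = spend / (spend + 3000)
    fatigue = 1 - 0.02 * max(0, frequency - 10)
    return min(0.99, base * fatigue)

for spend in [1000, 2500, 5000, 10000]:
    print(f"spend=${spend}: predicted success prob ≈ {score_campaign(spend, frequency=8):.2f}")
```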