AI_FILL
The AI_FILL function intelligently completes incomplete or missing data by analyzing patterns in provided example data and applying those patterns to fill gaps. Rather than simple interpolation or default value assignment, this function leverages large language models (LLMs) to understand contextual patterns and generate realistic, contextually appropriate values for missing cells. By default, it uses the Mistral AI platform, though any OpenAI-compatible API endpoint can be configured.
The function employs few-shot learning—a technique where the LLM is provided with complete example data to learn patterns before being asked to fill missing values. The model analyzes relationships between columns, typical value ranges, and contextual associations, then applies this understanding to complete the target data. This approach is particularly effective for tabular data where values have semantic meaning and dependencies on other columns.
For example, given employee data with several complete records, AI_FILL can infer appropriate departments or locations for new employees based on their job titles. With product data, it can deduce prices from categories and similar items. The function uses JSON mode (response_format: json_object) to ensure structured, parseable responses from the Mistral Chat Completions API.
The temperature parameter (default 0, range 0.0–2.0) controls the creativity/consistency trade-off. Setting temperature=0 produces deterministic, reproducible results ideal for business data, while higher values introduce variation. According to Mistral’s documentation, temperatures between 0.0 and 0.7 are recommended for most use cases. The max_tokens parameter (default 1500, range 5–5000) controls response size and API costs.
This example function is provided as-is without any representation of accuracy.
Excel Usage
=AI_FILL(example_range, fill_range, api_key, temperature, model, max_tokens, api_url)
example_range(list[list], required): 2D list of complete example data to learn patterns fromfill_range(list[list], required): 2D list of target data with missing values (null) to fillapi_key(str, required): API key for authentication.temperature(float, optional, default: 0): Controls randomness in AI response (0.0 = deterministic, 2.0 = highly random)model(str, optional, default: “codestral-2508”): Model ID to use. Default is “codestral-2508”.max_tokens(int, optional, default: 1500): Maximum tokens in the AI response (5 to 5000)api_url(str, optional, default: “https://api.mistral.ai/v1/chat/completions”): OpenAI-compatible API endpoint URL. Default is “https://api.mistral.ai/v1/chat/completions”.
Returns (list[list]): 2D list with filled values, or error message string.
Example 1: Demo case 1
Inputs:
| example_range | fill_range | temperature | model | max_tokens | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Product ID | Category | Price | Weight (kg) | Product ID | Category | Price | Weight (kg) | 0 | codestral-2508 | 1500 |
| PRD-001 | Laptop | 1299 | 1.8 | PRD-004 | Laptop | |||||
| PRD-002 | Laptop | 999 | 2.1 | PRD-005 | Tablet | |||||
| PRD-003 | Tablet | 499 | 0.7 | PRD-006 | 799 | 1.2 |
Excel formula:
=AI_FILL({"Product ID","Category","Price","Weight (kg)";"PRD-001","Laptop",1299,1.8;"PRD-002","Laptop",999,2.1;"PRD-003","Tablet",499,0.7}, {"Product ID","Category","Price","Weight (kg)";"PRD-004","Laptop","","";"PRD-005","Tablet","","";"PRD-006","",799,1.2}, 0, "codestral-2508", 1500)
Expected output:
| Product ID | Category | Price | Weight (kg) |
|---|---|---|---|
| PRD-004 | Laptop | 1299 | 1.8 |
| PRD-005 | Tablet | 499 | 0.7 |
| PRD-006 | Tablet | 799 | 1.2 |
Example 2: Demo case 2
Inputs:
| example_range | fill_range | temperature | model | max_tokens | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| Employee ID | Job Title | Department | Location | Employee ID | Job Title | Department | Location | 0 | codestral-2508 | 1500 |
| EMP-001 | Sales Manager | Sales | New York | EMP-005 | Sales Director | |||||
| EMP-002 | Marketing Specialist | Marketing | Chicago | EMP-006 | UX Designer | |||||
| EMP-003 | Sales Representative | Sales | Los Angeles | EMP-007 | Marketing Director | |||||
| EMP-004 | Software Developer | Engineering | San Francisco | EMP-008 | Senior Developer |
Excel formula:
=AI_FILL({"Employee ID","Job Title","Department","Location";"EMP-001","Sales Manager","Sales","New York";"EMP-002","Marketing Specialist","Marketing","Chicago";"EMP-003","Sales Representative","Sales","Los Angeles";"EMP-004","Software Developer","Engineering","San Francisco"}, {"Employee ID","Job Title","Department","Location";"EMP-005","Sales Director","","";"EMP-006","UX Designer","","";"EMP-007","Marketing Director","","";"EMP-008","Senior Developer","",""}, 0, "codestral-2508", 1500)
Expected output:
| Employee ID | Job Title | Department | Location |
|---|---|---|---|
| EMP-005 | Sales Director | Sales | New York |
| EMP-006 | UX Designer | Engineering | San Francisco |
| EMP-007 | Marketing Director | Marketing | Chicago |
| EMP-008 | Senior Developer | Engineering | San Francisco |
Example 3: Demo case 3
Inputs:
| example_range | fill_range | temperature | model | max_tokens | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Metric | Q1 2024 | Q2 2024 | Q3 2024 | Q4 2024 | Metric | Q1 2024 | Q2 2024 | Q3 2024 | Q4 2024 | 0 | codestral-2508 | 1500 |
| Revenue | 250000 | 280000 | 310000 | 350000 | Profit | |||||||
| Expenses | 180000 | 195000 | 215000 | 235000 | Headcount | 32 | 35 |
Excel formula:
=AI_FILL({"Metric","Q1 2024","Q2 2024","Q3 2024","Q4 2024";"Revenue",250000,280000,310000,350000;"Expenses",180000,195000,215000,235000}, {"Metric","Q1 2024","Q2 2024","Q3 2024","Q4 2024";"Profit","","","","";"Headcount",32,35,"",""}, 0, "codestral-2508", 1500)
Expected output:
| Metric | Q1 2024 | Q2 2024 | Q3 2024 | Q4 2024 |
|---|---|---|---|---|
| Profit | 70000 | 85000 | 95000 | 115000 |
| Headcount | 32 | 35 | 38 | 41 |
Example 4: Demo case 4
Inputs:
| example_range | fill_range | ||||||
|---|---|---|---|---|---|---|---|
| Product ID | Category | Price | Weight (kg) | Product ID | Category | Price | Weight (kg) |
| PRD-001 | Laptop | 1299 | 1.8 | PRD-007 | Laptop | ||
| PRD-002 | Laptop | 999 | 2.1 | ||||
| PRD-003 | Tablet | 499 | 0.7 |
Excel formula:
=AI_FILL({"Product ID","Category","Price","Weight (kg)";"PRD-001","Laptop",1299,1.8;"PRD-002","Laptop",999,2.1;"PRD-003","Tablet",499,0.7}, {"Product ID","Category","Price","Weight (kg)";"PRD-007","Laptop","",""})
Expected output:
| Product ID | Category | Price | Weight (kg) |
|---|---|---|---|
| PRD-007 | Laptop | 1299 | 1.8 |
Python Code
Show Code
import requests
import json
def ai_fill(example_range, fill_range, api_key, temperature=0, model='codestral-2508', max_tokens=1500, api_url='https://api.mistral.ai/v1/chat/completions'):
"""
Fills missing or incomplete data in a target range by learning patterns from an example range using a large language model.
This example function is provided as-is without any representation of accuracy.
Args:
example_range (list[list]): 2D list of complete example data to learn patterns from
fill_range (list[list]): 2D list of target data with missing values (null) to fill
api_key (str): API key for authentication.
temperature (float, optional): Controls randomness in AI response (0.0 = deterministic, 2.0 = highly random) Default is 0.
model (str, optional): Model ID to use. Default is "codestral-2508". Default is 'codestral-2508'.
max_tokens (int, optional): Maximum tokens in the AI response (5 to 5000) Default is 1500.
api_url (str, optional): OpenAI-compatible API endpoint URL. Default is "https://api.mistral.ai/v1/chat/completions". Default is 'https://api.mistral.ai/v1/chat/completions'.
Returns:
list[list]: 2D list with filled values, or error message string.
"""
# Validate temperature
if not isinstance(temperature, (float, int)) or not (0 <= float(temperature) <= 2):
return "Error: temperature must be a float between 0 and 2 (inclusive)"
# Validate max_tokens
if not isinstance(max_tokens, int) or not (5 <= max_tokens <= 5000):
return "Error: max_tokens must be an integer between 5 and 5000 (inclusive)"
# Validate inputs
if not isinstance(example_range, list) or not example_range:
return "Error: Example range is empty or invalid"
if not isinstance(fill_range, list) or not fill_range:
return "Error: Fill range is empty or invalid"
if not api_key:
return "You must include an API key to use this function. Sign up for a free API key at https://aistudio.google.com/, https://console.mistral.ai/, or other providers and add your own api_key. You may use any OpenAI compatible API, just update the api_url parameter."
# Convert example_range and fill_range to JSON strings for the prompt
example_json = json.dumps(example_range)
fill_json = json.dumps(fill_range)
# Construct a specific prompt for filling data
fill_prompt = (
"Fill in the missing values in the target data based on patterns in the example data.\n"
f"Example data (complete): {example_json}\n\n"
f"Target data (with missing values): {fill_json}\n\n"
"Study the patterns in the example data and complete the target data by filling in missing values. Preserve all existing values in the target data."
)
fill_prompt += ("\n\nReturn ONLY a JSON object with a key 'items' whose value is a JSON array of arrays (2D array) with the completed target data. "
"Do not include any explanatory text, just the JSON object. "
"For example: {\"items\": [[\"row1col1\", \"row1col2\"], [\"row2col1\", \"row2col2\"]]}")
# Prepare the API request payload
payload = {
"messages": [{"role": "user", "content": fill_prompt}],
"temperature": temperature,
"model": model,
"max_tokens": max_tokens,
"response_format": {"type": "json_object"}
}
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
"Accept": "application/json"
}
try:
response = requests.post(api_url, headers=headers, json=payload)
if response.status_code == 429:
return "Error: You have hit the rate limit for the API. Please try again later"
response.raise_for_status()
response_data = response.json()
content = response_data["choices"][0]["message"]["content"]
try:
filled_data = json.loads(content)
if isinstance(filled_data, dict) and "items" in filled_data:
filled_data = filled_data["items"]
elif isinstance(filled_data, dict):
if "data" in filled_data:
filled_data = filled_data["data"]
elif "filled_data" in filled_data:
filled_data = filled_data["filled_data"]
elif "result" in filled_data:
filled_data = filled_data["result"]
if isinstance(filled_data, list) and all(isinstance(row, list) for row in filled_data):
if (len(filled_data) == len(fill_range) and all(len(row) == len(fill_range[i]) for i, row in enumerate(filled_data))):
return filled_data
else:
return "Error: AI response dimensions don't match the fill range"
else:
return "Error: Unable to parse response. Expected a 2D array"
except (json.JSONDecodeError, ValueError):
return "Error: Unable to fill data. The AI response wasn't in the expected format"
except requests.exceptions.RequestException as e:
return f"Error: API request failed. {str(e)}"