AI Realtor: Grounded Persuasive Language Generation for Automated Marketing

Jibang Wu1,*, Chenghao Yang1,*, Simon Mahns1, Chaoqi Wang1, Hao Zhu2, Fei Fang3, Haifeng Xu1
1University of Chicago, 2Stanford University, 3Carnegie Mellon University
*Equal contribution

Abstract

This paper develops an agentic framework that employs large language models (LLMs) to automate the generation of persuasive and grounded marketing content, using real estate listing descriptions as the focal application domain. The method is designed to align the generated content with user preferences while highlighting useful factual attributes. The agent consists of three key modules: (1) Grounding Module, mimicking expert human behavior to predict marketable features; (2) Personalization Module, aligning content with user preferences; (3) Marketing Module, ensuring factual accuracy and the inclusion of localized features. Human-subject experiments demonstrate that marketing descriptions generated by our approach are preferred over those written by human experts by a clear margin, suggesting a promising LLM-based agentic framework to automate large-scale targeted marketing while ensuring responsible generation using only facts.

Introduction

While large language models (LLMs) have made significant strides across various tasks, their ability to persuade remains an underexplored frontier. This capability is particularly important since persuasion-related economic activities underpin roughly $30\%$ of the US GDP, creating tremendous opportunities for applying LLMs across various sectors.

This research studies language generation for grounded persuasion — a form of persuasion inspired by Aristotle's philosophy that is grounded in fact, tailored to the audience, and adapted to contextual factors. Using real estate marketing as a testbed, we construct a realistic evaluation environment to measure the persuasiveness of preference-based generation.

"The faculty of observing, in any given case, the available means of persuasion." — Aristotle, Rhetoric.

Methodology: AI Realtor Framework

Figure 1: Illustration of the Design Pipeline of AI Realtor

The AI Realtor framework automates the generation of persuasive marketing content through three key modules:

Grounding Module

This module predicts credible features for marketing by learning the attribute-feature mapping. It transforms raw factual attributes (e.g., square footage, floor number) into marketable features (e.g., "spacious layout", "bright room") by learning from human-written descriptions.

We constructed a high-quality feature schema and labeled dataset via a human-LLM collaborative method (shown in the figure below) and used them to train an LLM that achieves $69.39\%$ test accuracy in predicting which features should be highlighted based on a property's attributes.

Figure 2(a): Illustration of the inductive feature schema construction pipeline

Personalization Module

This module aligns content with user preferences by eliciting user preferences and adjusting feature selection accordingly. Instead of relying solely on machine learning, it uses a scoring approach that balances feature validity with user preference ratings to select the most appropriate features to emphasize for each individual user. The user elicitation interface is shown below.
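The scoring idea described in this section can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Candidate` fields, the linear blend, the `weight` parameter, and the example numbers are all assumptions introduced here. The text only states that the score balances feature validity with user preference ratings.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    feature: str
    validity: float    # assumed: confidence in [0, 1] that the feature holds for the listing
    preference: float  # assumed: the user's rating of the feature, rescaled to [0, 1]


def select_features(candidates, k=3, weight=0.5):
    """Rank candidates by a weighted blend of validity and preference.

    The linear blend is a hypothetical choice; any scoring rule that trades
    off the two quantities would fit the description in the text.
    """
    def score(c):
        return weight * c.validity + (1.0 - weight) * c.preference

    ranked = sorted(candidates, key=score, reverse=True)
    return [c.feature for c in ranked[:k]]


# Illustrative values only, not data from the study.
candidates = [
    Candidate("spacious layout", validity=0.9, preference=0.2),
    Candidate("bright room", validity=0.6, preference=0.9),
    Candidate("ample storage", validity=0.8, preference=0.8),
    Candidate("quiet street", validity=0.3, preference=0.4),
]
print(select_features(candidates, k=2))
```

Setting `weight=1.0` in this sketch ignores the user entirely and ranks by validity alone, which makes the effect of the preference term easy to inspect.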

Marketing Module

Inspired by marketing research showing that buyers derive entertainment utility from surprising effects, this module identifies and highlights features that are relatively rare compared to surrounding properties. It determines "surprising features" based on their percentile in the feature distribution, giving LLMs localized feature information through Retrieval-Augmented Generation (RAG). We present one piece of user feedback on the surprising effect below. Personally identifiable information has been redacted.

...AI Realtor specifically points out the rarity of the ample storage and built-in cabinetry in similarly priced listings, making the property stand out.

Figure 2(c): RAG-Empowered AI Realtor compares similar listings to identify surprising features.
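One way to picture the rarity check is the sketch below. It is a simplification under stated assumptions: the function name, the `max_share` threshold, and the use of a plain share-of-neighbors statistic in place of the percentile computation mentioned in the text are all hypothetical, and the retrieval of comparable listings is assumed to have happened already.

```python
def surprising_features(listing_features, neighbor_feature_sets, max_share=0.25):
    """Return (feature, share) pairs for features that are rare among neighbors.

    share is the fraction of retrieved comparable listings that also have the
    feature. max_share is an assumed cutoff, not a value from the paper.
    """
    if not neighbor_feature_sets:
        return []  # no comparables retrieved, so rarity is undefined
    n = len(neighbor_feature_sets)
    rare = []
    for feature in sorted(listing_features):
        share = sum(feature in s for s in neighbor_feature_sets) / n
        if share <= max_share:
            rare.append((feature, share))
    return rare


# Illustrative feature sets only, not data from the study.
neighbors = [
    {"garage", "ample storage"},
    {"garage"},
    {"garage", "balcony"},
    {"garage"},
    {"balcony"},
]
listing = {"garage", "ample storage", "built-in cabinetry"}
for feature, share in surprising_features(listing, neighbors):
    print(f"{feature}: present in {share:.0%} of comparable listings")
```

The resulting list could then be passed to the LLM as localized context, so that the description can point out what sets the property apart from similar listings.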

Theoretical Framework

The research is underpinned by a micro-economic framework for automated marketing based on strategic communication theory. Check out our paper for more details. Key components include:

Evaluation and Results

Human Feedback Evaluation

Figure 3: Comparison of model performance using Elo ratings

Systematic evaluation through human feedback shows that AI Realtor clearly outperforms both human experts and other model variants, measured by standard Elo ratings and win rates:

The human subjects in the experiment were asked to compare pairs of descriptions for the same property without knowing which was AI-generated and which was human-written. Each participant rated which description would make them more interested in the property, and by how much. We built a ChatArena-like interface to collect human feedback.
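For readers unfamiliar with Elo ratings, the standard update from a single pairwise comparison looks like the sketch below. The K-factor of 32, the starting rating of 1000, and the generic names are assumptions chosen for illustration. The page does not specify the exact Elo configuration used, nor how the "by how much" signal enters the rating.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, a, b, score_a, k=32.0):
    """Update ratings in place after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k is an assumed K-factor, not a value taken from the paper.
    """
    delta = k * (score_a - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta


# Hypothetical comparison between two anonymous descriptions.
ratings = {"description_a": 1000.0, "description_b": 1000.0}
update_elo(ratings, "description_a", "description_b", score_a=1.0)
print(ratings)
```

Because the update is zero-sum, the total rating across all compared systems stays constant, and a win against a higher-rated opponent moves the ratings more than a win against a lower-rated one.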

Figure 4: Illustration of the inductive feature schema construction pipeline

AI-Simulated Evaluation

To explore the potential of scaling evaluation through AI simulation, we employed an LLM to simulate the responses of buyers from previous experiments:

Key findings from the AI simulation experiment:

Hallucination Checks via Fact-Checking

Figure 6: Faithfulness Scores for Different Models in Hallucination Checks

To ensure the generated content remained factually accurate, we conducted fine-grained fact-checking tests. Specifically:

We note that it is debatable whether such vague descriptions constitute true hallucination, though some buyers did complain about this kind of language in their responses.

Key Findings

Applications and Impact

This research has significant implications for:

Ethics Consideration

From an ethical standpoint, we recognize the potential risks of deploying persuasive language agents, particularly regarding LLM hallucinations and misinformation. To address this, we conduct a fine-grained fact-checking analysis and find no substantial hallucination risks in our designed agents. However, we acknowledge that this remains an open challenge and encourage further investigations into mitigating potential unintended consequences.

We have obtained IRB approval (exempt) for our data collection and annotation. The Zillow-based source data used in our study are publicly available and processed to remove identifiable information. Additionally, we will release our code and annotation data (subject to IRB requirements and annotator agreements) to foster continued research in this area.