Feature/docs generator (#11858)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)


### What problem does this PR solve?

This PR introduces a new Docs Generator agent component for producing
downloadable PDF, DOCX, or TXT files from Markdown content generated
within a RAGFlow workflow.

### **Key Features**

**Backend**

- New component: DocsGenerator (agent/component/docs_generator.py)
- Markdown → PDF/DOCX/TXT conversion
- Supports tables, lists, code blocks, headings, and rich formatting
- Configurable document style (fonts, margins, colors, page size,
orientation)
- Optional header logo and footer with page numbers/timestamps
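
For readers who want a feel for the style options listed above, here is a minimal TypeScript sketch of how such a configuration could be modelled and merged with defaults. Every name in it (`DocStyleSketch`, `resolveStyle`, the individual fields and default values) is an assumption made for illustration; the actual parameters are defined in agent/component/docs_generator.py and described in the component guide.

```typescript
// Hypothetical model of the Docs Generator style options.
// None of these names are taken from the PR; they only illustrate the idea.
type OutputFormat = 'pdf' | 'docx' | 'txt';

interface DocStyleSketch {
  outputFormat: OutputFormat;
  fontFamily: string;
  fontSizePt: number;
  marginMm: number;
  textColor: string; // hex colour, e.g. '#000000'
  pageSize: 'A4' | 'Letter';
  orientation: 'portrait' | 'landscape';
  showPageNumbers: boolean;
  showTimestamp: boolean;
  logoUrl?: string; // optional header logo
}

// Assumed defaults, chosen only so the sketch is self-contained.
const DEFAULT_STYLE: DocStyleSketch = {
  outputFormat: 'pdf',
  fontFamily: 'Helvetica',
  fontSizePt: 11,
  marginMm: 20,
  textColor: '#000000',
  pageSize: 'A4',
  orientation: 'portrait',
  showPageNumbers: true,
  showTimestamp: false,
};

// Merge user overrides into the defaults and reject obviously invalid values.
function resolveStyle(overrides: Partial<DocStyleSketch> = {}): DocStyleSketch {
  const style: DocStyleSketch = { ...DEFAULT_STYLE, ...overrides };
  if (style.fontSizePt <= 0) {
    throw new RangeError('fontSizePt must be positive');
  }
  if (style.marginMm < 0) {
    throw new RangeError('marginMm must not be negative');
  }
  return style;
}
```

Calling `resolveStyle({ orientation: 'landscape' })` would keep every default except the orientation, which is the usual reason for separating defaults from overrides in a form-driven configuration UI.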

**Frontend**

- New configuration UI for the Docs Generator
- Download button integrated into the chat interface
- Output wired to the Message component
- Full i18n support

**Documentation**

Added component guide:
docs/guides/agent/agent_component_reference/docs_generator.md

**Usage**

Add the Docs Generator to a workflow, connect Markdown output from an
upstream component, configure metadata/style, and feed its output into
the Message component. Users will see a document download button
directly in the chat.
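
To illustrate how a chat message might be split into a download payload and the remaining visible text, here is a small TypeScript sketch. It is not the code from `pdf-download-button`; the payload shape (`filename`, `base64`, `mime_type`) and the function names `extractDownloadInfoSketch` and `removeDownloadInfoSketch` are assumptions made for this example, and the assumed wire format is a single JSON object embedded in the message text.

```typescript
// Hypothetical payload shape; the real fields are defined in
// pdf-download-button and may differ.
interface DownloadInfoSketch {
  filename: string;
  base64: string;
  mime_type: string;
}

// Assumption: the message contains at most one JSON object, spanning from
// the first '{' to the last '}'.
function extractDownloadInfoSketch(content: string): DownloadInfoSketch | null {
  const start = content.indexOf('{');
  const end = content.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(content.slice(start, end + 1));
    if (typeof parsed !== 'object' || parsed === null) return null;
    const p = parsed as Record<string, unknown>;
    if (
      typeof p.filename === 'string' &&
      typeof p.base64 === 'string' &&
      typeof p.mime_type === 'string'
    ) {
      return { filename: p.filename, base64: p.base64, mime_type: p.mime_type };
    }
  } catch {
    // The braces did not enclose valid JSON; treat the message as plain text.
  }
  return null;
}

// Strip the JSON part so that only the human-readable text is rendered.
function removeDownloadInfoSketch(
  content: string,
  info: DownloadInfoSketch | null,
): string {
  if (!info) return content;
  const start = content.indexOf('{');
  const end = content.lastIndexOf('}');
  return (content.slice(0, start) + content.slice(end + 1)).trim();
}
```

With these helpers, a message item can render the download button from the extracted info and pass only the stripped text to the Markdown renderer, so the raw JSON never appears in the chat.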

**Contributor Note**

We have been following RAGFlow for more than a year and a half and
have worked extensively on customizing the framework and integrating
it into several of our internal systems. Over that time, we have built
multiple platforms that rely on RAGFlow as a core component, which has
given us a strong appreciation for how flexible and powerful the
project is.

We also previously contributed the full Italian translation, and we were
glad to see it accepted. This new Docs Generator component was created
for our own production needs, and we believe that it may be useful for
many others in the community as well.

We want to sincerely thank the entire RAGFlow team for the remarkable
work you have done and continue to do. If there are opportunities to
contribute further, we would be glad to help whenever we have time
available. It would be a pleasure to support the project in any way we
can.

If appropriate, we would be glad to be listed among the project’s
contributors, but in any case we look forward to continuing to support
and contribute to the project.

PentaFrame Development Team

---------

Co-authored-by: PentaFrame <info@pentaframe.it>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
PentaFDevs
2025-12-12 07:59:43 +01:00
committed by GitHub
parent 6560388f2b
commit f9510edbbc
29 changed files with 3043 additions and 102 deletions

@@ -4,6 +4,7 @@ import {
   IMessage,
   IReferenceChunk,
   IReferenceObject,
+  UploadResponseDataType,
 } from '@/interfaces/database/chat';
 import classNames from 'classnames';
 import {
@@ -24,6 +25,11 @@ import { WorkFlowTimeline } from '@/pages/agent/log-sheet/workflow-timeline';
 import { isEmpty } from 'lodash';
 import { Atom, ChevronDown, ChevronUp } from 'lucide-react';
 import MarkdownContent from '../next-markdown-content';
+import {
+  PDFDownloadButton,
+  extractPDFDownloadInfo,
+  removePDFDownloadInfo,
+} from '../pdf-download-button';
 import { RAGFlowAvatar } from '../ragflow-avatar';
 import { useTheme } from '../theme-provider';
 import { Button } from '../ui/button';
@@ -95,6 +101,20 @@ function MessageItem({
     return Object.values(docs);
   }, [reference?.doc_aggs]);
+  // Extract PDF download info from message content
+  const pdfDownloadInfo = useMemo(
+    () => extractPDFDownloadInfo(item.content),
+    [item.content],
+  );
+  // If we have PDF download info, extract the remaining text
+  const messageContent = useMemo(() => {
+    if (!pdfDownloadInfo) return item.content;
+    // Remove the JSON part from the content to avoid showing it
+    return removePDFDownloadInfo(item.content, pdfDownloadInfo);
+  }, [item.content, pdfDownloadInfo]);
   const handleRegenerateMessage = useCallback(() => {
     regenerateMessage?.(item);
   }, [regenerateMessage, item]);
@@ -219,28 +239,39 @@ function MessageItem({
       />
     </div>
   )}
-  <div
-    className={cn({
-      [theme === 'dark'
-        ? styles.messageTextDark
-        : styles.messageText]: isAssistant,
-      [styles.messageUserText]: !isAssistant,
-      'bg-bg-card': !isAssistant,
-    })}
-  >
-    {item.data ? (
-      children
-    ) : sendLoading && isEmpty(item.content) ? (
-      <>{!isShare && 'running...'}</>
-    ) : (
-      <MarkdownContent
-        loading={loading}
-        content={item.content}
-        reference={reference}
-        clickDocumentButton={clickDocumentButton}
-      ></MarkdownContent>
-    )}
-  </div>
+  {/* Show PDF download button if download info is present */}
+  {pdfDownloadInfo && (
+    <PDFDownloadButton
+      downloadInfo={pdfDownloadInfo}
+      className="mb-2"
+    />
+  )}
+  {/* Show message content if there's any text besides the download */}
+  {messageContent && (
+    <div
+      className={cn({
+        [theme === 'dark'
+          ? styles.messageTextDark
+          : styles.messageText]: isAssistant,
+        [styles.messageUserText]: !isAssistant,
+        'bg-bg-card': !isAssistant,
+      })}
+    >
+      {item.data ? (
+        children
+      ) : sendLoading && isEmpty(messageContent) ? (
+        <>{!isShare && 'running...'}</>
+      ) : (
+        <MarkdownContent
+          loading={loading}
+          content={messageContent}
+          reference={reference}
+          clickDocumentButton={clickDocumentButton}
+        ></MarkdownContent>
+      )}
+    </div>
+  )}
   {isAssistant && referenceDocuments.length > 0 && (
     <ReferenceDocumentList
       list={referenceDocuments}
@@ -248,7 +279,9 @@ function MessageItem({
   )}
   {isUser && (
-    <UploadedMessageFiles files={item.files}></UploadedMessageFiles>
+    <UploadedMessageFiles
+      files={item.files as File[] | UploadResponseDataType[]}
+    ></UploadedMessageFiles>
   )}
   {/* {isAssistant && item.attachment && item.attachment.doc_id && (
     <div className="w-full flex items-center justify-end">