Feature specification of the Content Creation Dialog

Basic idea

A page in Composum Pages consists of various components, many of which contain text attributes: for instance, a teaser for another page (containing a title, subtitle and a text), a text component that has a subtitle and a text, or a section component that contains a title and other components. We want to create a content creation dialog that can be opened from any attribute textfield, textarea or richtext editor in the component dialog and supports creating content for that attribute. The created content replaces the attribute content if the user presses "Replace" in the dialog, or is appended to the attribute content if the user presses "Append". The dialog should strike a good balance between being flexible to use and not being too complicated.

The content creation itself is to be implemented using the ChatGPT completion API. The user can specify ChatGPT prompts in various ways, and include existing text from the page as additional input for ChatGPT, so that ChatGPT can be used to create content from a prompt, extend, summarize or excerpt content, suggest improvements to the text, generate titles and more.

Basic implementation decisions

  • The content creation dialog should contain all necessary elements from the beginning - it should not be a flow of several dialogs, but a single dialog the user works in until they are satisfied.
  • For textareas and richtext editors there is an optional drop down list that gives a rough indication of the desired text length. For textfields (a single line of content) it is absent. Options for this drop down list would be: "one line", "one sentence", "one paragraph", "several paragraphs".
  • For now we will just request one variant at a time from ChatGPT.
  • For selecting the content to include as additional input for ChatGPT, there is a drop down list with the following options: the current text of the edited attribute, the text of the edited component (and subcomponents, if applicable), and the full text of the whole page. This is the only way to provide existing content.
  • We will make a history for the field that has the ChatGPT suggestion, as the user will very likely want to switch back and forth between the texts it generated and take the best one. For that we will need back and forth buttons. The history is cleared on each new dialog start.
  • The dialog has to be resizeable to work with large amounts of text.
  • There should be a loading indicator that shows whether we are currently waiting for a ChatGPT response. States: idle (= used also when done), processing.
  • We will not allow the user to edit the ChatGPT response in this dialog, as it can still be edited in the text field of the component dialog.
  • For better usability, tooltips or help texts could be added to explain each feature of the dialog.
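
The two states of the loading indicator can be sketched as a tiny state machine. This is only an illustration: the spec names the states "idle" and "processing", while the function `nextStatus` and the event names `generate`, `response` and `error` are assumptions made for this sketch.

```javascript
// Hypothetical sketch: status transitions of the loading indicator.
// Only the two state names come from the spec; the event names are assumed.
const STATUS = Object.freeze({ IDLE: "idle", PROCESSING: "processing" });

function nextStatus(status, event) {
  if (status === STATUS.IDLE && event === "generate") {
    return STATUS.PROCESSING; // request was sent: show the spinner
  }
  if (status === STATUS.PROCESSING && (event === "response" || event === "error")) {
    return STATUS.IDLE; // finished or failed: "idle" is also used when done
  }
  return status; // events that do not apply in the current state are ignored
}
```

With this model a second "generate" while a request is running leaves the status at "processing"; whether the Generate button should additionally be disabled in that state is left open here.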

Out of scope

We will currently not include 'temperature' and 'max tokens' settings. The desired text length can be specified by the user in the prompt. Apart from the per-session history of generated texts, we include no history and no undo feature.

User Workflow

To support the dialog design let's see some typical user workflows. Here are some likely use cases for the feature:

  1. Content Generation: The user wants to create new content for a blank text field. They open the content creation dialog, write a prompt in the prompt field and select the desired text length from the dropdown menu. They then press "Replace" and the generated content is added to the text field.
  2. Title Generation: The user wants to create a title for a section or page based on the content of that section or page. They can open the content creation dialog on the title textfield, select the option to include the text of the edited component or the full text of the page, select "summary" as the prompt, and then press "Replace". The generated title replaces the existing title.
  3. Content Summary: The user wants to create a summarized version of a longer text, for example, for a title, subtitle, or summary paragraph. They open the content creation dialog, select the content to summarize (e.g. the current page text or the text of the current component, especially if it's a section with subcomponents), select "summary" and the intended length from the drop down menus. After pressing "Replace", the generated summary replaces the long text in the text field.
  4. Content Extension: The user has already written a portion of the content but needs help extending it. They open the content creation dialog, write a continuation prompt in the prompt field, and select the option to include the current text of the edited attribute. They then select the desired text length from the dropdown menu and press "Append". The generated content is appended to the existing content in the text field. Alternatively, they could choose "extend" from the list of predefined prompts. If the current text is to be replaced instead, e.g. because it consists of key points that should become a full text, the user would press "Replace".
  5. Content Improvement: The user is not satisfied with the written content and wants suggestions to improve it. They open the content creation dialog, write a prompt asking for improvements or select "improve" from the list of predefined prompts, and select the option to include the current text of the edited attribute. They then press "Replace", and the generated improved content replaces the existing content in the text field.
  6. Excerpt Generation: The user wants to create an excerpt from a longer piece of content. They open the content creation dialog, write a prompt asking for an excerpt, and select the option to include the current text of the edited attribute or the full text of the page. They then press "Replace", and the generated excerpt replaces the existing content in the text field.
  7. Idea Generation: The user is stuck on creating new content and needs some inspiration. They open the content creation dialog, write a general prompt in the prompt field related to the topic they need ideas on, and then press "Replace". The generated ideas or suggestions will replace the current text in the text field. They can continue to iterate on this until they find an idea that they like.

Dialog Elements

Given these workflows, the content creation dialog could have the following elements:

  • Prompt Field: A text field where the user can write a custom prompt for ChatGPT. It should be a text area that can contain multiple lines.

  • Predefined prompts: Dropdown menu with pre-defined prompts like "summary", "improve", "extend", "title generation", etc. that are suitable for various use-cases. Selecting this will replace the prompt field content.

  • Content Selector: A dropdown menu for selecting the content to include as additional input for ChatGPT. The options could include the current text of the edited attribute, the text of the edited component (and subcomponents, if applicable) and the full text of the whole page. An abbreviation of that content will be displayed as tooltip for the individual elements of the selection.

  • Text Length Selector: For textareas and richtext editors, a dropdown menu to select the desired text length. Options could include "one line", "one sentence", "one paragraph", "several paragraphs". This option is absent for textfields (a single line of content).

  • ChatGPT Response Field: A text area to display the content generated by ChatGPT. This should be a large, resizable area since the generated content can be quite long. We do not allow editing, since the user can edit the text in the dialog from which they opened the ChatGPT dialog. The user can see the generated content here before deciding to replace or append it to the existing content.

  • History Navigation: A pair of "Back" and "Forward" buttons that allow the user to navigate through the history of generated texts for the current session. This way, they can easily compare different generated texts and choose the one they like best.

  • "Replace" Button: This button will replace the existing content of the attribute with the content generated by the AI.

  • "Append" Button: This button will append the content generated by the AI to the existing content of the attribute.

  • "Generate" Button: This button will generate the content based on the provided prompt and additional input settings. The generated content will be shown in the ChatGPT Response Field.

  • "Cancel" Button: This button will close the dialog without making any changes to the existing content.

  • Loading Indicator: An indicator showing the current status of the ChatGPT response. This could be a simple spinner that shows when the API is processing and disappears when the response is ready. The states could be: idle (= used also when done), processing.

  • Close/Cancel Button: A button to close the dialog without applying any changes. This is useful if the user decides not to use the generated content after all.

  • Alert: a normally hidden area that can contain error messages or warnings. The text will be shown in red, so a label is not necessary.

  • Help: opens a page with a description of the dialog, and some example usages.
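
The Content Selector shows an abbreviation of the selectable content as a tooltip. How that abbreviation is produced is not specified; the following is a minimal sketch, assuming a hypothetical helper `abbreviate` with an assumed default limit of 80 characters and a trailing "..." marker.

```javascript
// Hypothetical helper: shorten a text so it can be shown as a tooltip.
// The default limit and the "..." marker are assumptions, not from the spec.
function abbreviate(text, maxLength = 80) {
  const normalized = text.replace(/\s+/g, " ").trim(); // collapse whitespace and line breaks
  if (normalized.length <= maxLength) {
    return normalized;
  }
  // Cut at the limit, leaving room for the three-character marker.
  return normalized.slice(0, maxLength - 3).trimEnd() + "...";
}
```

For example, the option for the full page text could carry `abbreviate(pageText)` as its title attribute, so the user sees the beginning of the text that would be sent along with the prompt.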

Structure of the dialog

To build an intuitive and user-friendly interface for the dialog, it's crucial to structure the elements in a logical order that aligns with the user's workflow. This involves grouping the elements based on their function and arranging them in the sequence they are likely to be used. We order these dialog elements in the following groups below each other. Some groups have subgroups, which have an individual frame around them.

  1. Prompt Group: This group has the elements that the user interacts with to specify the prompt and the additional input for ChatGPT.

    1. Prompt details
      • Predefined Prompts
      • Content Selector
      • Text Length Selector
    2. Prompt Area:
      • Prompt Textarea (5 lines) with a label just above it.
  2. Generation Control: This group lets the user control the generation process. All buttons are arranged in a horizontal line with two subgroups.

    1. Generation Control: left aligned
      • Generate Button
      • Loading Indicator
    2. Content preview history navigation: right aligned
      • Back
      • Forward
  3. Alert: normally hidden.

  4. Content Preview: Allows the user to review the generated content.

    • ChatGPT Response Field, with a label just above it.
  5. Content Actions and Dialog Control: left aligned

    • Replace Button
    • Append Button
    • Cancel Button
    • Help Button (in addition to a help icon in the dialog frame)

The help and maximize buttons should appear as icons next to the close icon, all three right aligned on the top in the dialog frame.

Preview of the dialog.

+--------------------------------------------------------- [?] [□] [x] -+
|                                                                       |
| [\/Predefined] [\/Content Selector] [\/Text Length Selector]          |
|                                                                       |
| Prompt_Textarea______________________________________________________ |
|                                                                       |
| [Generate] [Spinner]            [Back] [Forward]                      |
|                                                                       |
| Alert_Text___________________________________________________________ |
|                                                                       |
| ChatGPT_Response_Field_______________________________________________ |
| ________________________________________________________             |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
|                                                                       |
| [Replace] [Append] [Cancel]                                           |
|                                                                       |
|                                                                       |
+-----------------------------------------------------------------------+

Ascii-Art representing the dialog design.[^1]

Suggestion for dialog design.[^2]

[Help] [Maximize] [X]
Predefined Prompts: Summary | Improve | Extend | Title Generation
Content Selector: Current Text of Edited Attribute | Text of Edited Component | Full Text of Page
Text Length Selector: One Line | One Sentence | One Paragraph | Several Paragraphs
Prompt
<textarea rows="5" cols="50"></textarea>
Generate [Processing] Back Forward
Alert Area
ChatGPT Response
<textarea rows="10" cols="50" readonly></textarea>
Replace Append Cancel Help
A wireframe rendering of the dialog.[^3] See footnote for related prompt.[^3]

User interaction diagram

sequenceDiagram
    participant User
    participant Dialog
    participant Backend
    participant ChatGPT
    User ->> Dialog: Open Dialog
    Dialog ->> User: Display Dialog Elements
    User ->> Dialog: Write a Prompt
    User ->> Dialog: Select Additional Input
    User ->> Dialog: Select Text Length
    User ->> Dialog: Press Generate Button
    Dialog ->> Backend: Send Prompt and Additional Input
    Backend ->> ChatGPT: Forward Request
    ChatGPT -->> Backend: Return Generated Text
    Backend -->> Dialog: Send Generated Text
    Dialog ->> User: Display Generated Text
    User ->> Dialog: Press Replace or Append
    Dialog ->> User: Update Text Field
    User ->> Dialog: Close Dialog

Saving State

Since content creation often takes a couple of attempts to get right, we want the user to be able to switch back and forth between the last results. Therefore we provide "back" and "forth" buttons to switch between previous states of the dialog and keep a history that contains the states of all dialog fields at certain points. There are several variants as to when the state should be saved:

  • On Generating New Content: Each time new content is generated, the current state of the dialog is saved. That has, however, the problem that when a user generates some content, changes it in the response field, and then hits 'Generate' again, that changed content in the response field would not be saved.
  • On Navigation: When the user navigates back or forth through the history, the current state of the dialog should be saved before the state is changed. That, however, has the problem that when a user generates some content, then edits the prompt field, and then hits 'back' or 'forth', then the changed prompt would be saved, which does not correspond to the result.

To alleviate these problems we combine those, which might generate an unpleasant number of states in the history, but avoids both mentioned kinds of problems:

  • Always save the state before it's changed: Whether the user hits 'Back', 'Forward', 'Reset' or the content generation was finished, always save the current state before anything is changed. (Of course only if it was modified in comparison to the last history entry.)

However, the dialog history should be cleared each time the content creation dialog is opened, as it is only meant to be a temporary history.
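
The "always save the state before it's changed" rule can be sketched as follows. This is a minimal illustration, not the actual implementation: the class `DialogHistory` and its method names are hypothetical, a state is assumed to be a plain object with the values of all dialog fields, and "modified" is interpreted here as "differs from the history entry currently shown", which is one possible reading of the rule above.

```javascript
// Hypothetical sketch of the temporary dialog history.
// Assumption: states are plain objects whose fields are always in the same order,
// so comparing their JSON serialization is sufficient.
class DialogHistory {
  constructor() {
    this.states = [];   // saved states, oldest first
    this.position = -1; // index of the entry currently shown, -1 if none
  }

  // Save `current` unless it equals the entry currently shown.
  saveIfModified(current) {
    const shown = this.states[this.position];
    if (shown !== undefined && JSON.stringify(shown) === JSON.stringify(current)) {
      return false;
    }
    this.states.push({ ...current });
    this.position = this.states.length - 1;
    return true;
  }

  // "Back": save first, then step to the previous entry (null if there is none).
  back(current) {
    this.saveIfModified(current);
    if (this.position <= 0) return null;
    this.position -= 1;
    return { ...this.states[this.position] };
  }

  // "Forward": save first, then step to the next entry (null if there is none).
  forward(current) {
    this.saveIfModified(current);
    if (this.position >= this.states.length - 1) return null;
    this.position += 1;
    return { ...this.states[this.position] };
  }

  // Called each time the dialog is opened.
  clear() {
    this.states = [];
    this.position = -1;
  }
}
```

In this sketch the dialog would call `saveIfModified` with the current field values when a generation finishes, before the new response is displayed, and would pass the current field values to `back` and `forward`. A state that was edited while an older entry is shown is appended at the end and becomes the newest entry.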

Implementation plan

The implementation of a dialog consists of the following parts:

  1. The dialog would be rendered with /libs/composum/pages/options/ai/dialogs/create/create.jsp (resource composum/pages/options/ai/dialogs/create in Apache Sling) from com.composum.ai.composum.bundle.AIDialogServlet and uses model model.com.composum.ai.composum.bundle.ChatGPTCreateDialogModel.

  2. The URL would be e.g. /bin/cpm/ai/dialog.creationDialog.html/content/ist/software/home/test/_jcr_content/create

  3. The content is created via an additional JSON AJAX request that is then forwarded to ChatGPT.

  4. The JavaScript class CreateDialog in /libs/composum/pages/options/ai/js/chatgpt.js triggers the loading of the dialog and the JSON AJAX call com.composum.ai.composum.bundle.AIServlet.CreationOperation with /bin/cpm/ai/authoring.create.json for the creation process.

  5. Necessary extensions:

    • com.composum.ai.composum.bundle.AIDialogServlet new operation creationDialog.
    • possibly changes to the backend.

For the selects that contain fragments of the prompt we set the value to the prompt fragment.
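
As an illustration of selects whose values are prompt fragments, the sketch below joins the prompt text with the fragment of the text length select. The fragment wordings, the table `TEXT_LENGTH_OPTIONS` and the function `composePrompt` are assumptions made for this example; where the parts are actually joined (in the dialog or in the backend) is not specified here.

```javascript
// Hypothetical option table: label shown to the user -> prompt fragment used as option value.
// The wording of the fragments is made up for this example.
const TEXT_LENGTH_OPTIONS = {
  "one line": "Answer in a single line.",
  "one sentence": "Answer in one sentence.",
  "one paragraph": "Answer in one paragraph.",
  "several paragraphs": "Answer in several paragraphs.",
};

// Join the prompt text with the selected fragment, skipping empty parts.
function composePrompt(prompt, lengthFragment) {
  const parts = [prompt.trim(), (lengthFragment || "").trim()];
  return parts.filter((part) => part.length > 0).join("\n\n");
}
```

Because the option value already is the fragment, no lookup from a key to a text is needed when the prompt is assembled; for a textfield, where the text length select is absent, the fragment is simply omitted.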

Identifiers etc.

We use the following identifiers:

  • {feature} = create
  • {resourcetype} = composum/pages/options/ai/dialogs/create
  • {dialogURL} = /bin/cpm/ai/dialog.creationDialog.html
  • ID for dialog: chatgpt-create-dialog
  • HTML class for dialog fields:
    • Prompt Textarea: prompt-textarea
    • Predefined Prompts: predefined-prompts
    • Content Selector: content-selector
    • Text Length Selector: text-length-selector
    • Generate Button: generate-button
    • Loading Indicator: loading-indicator
    • Back Button: back-button
    • Forward Button: forward-button
    • Alert Text: alert
    • ChatGPT Response Field: ai-response-field
    • Replace Button: replace-button
    • Append Button: append-button
    • Cancel Button: cancel-button
    • Help Button: help-button
  • Parameter names for the inputs:
    • Prompt Textarea: prompt
    • Content Selector: contentSelect
    • Text Length Selector: textLength
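
The parameter names above could be used to build the create request roughly as follows. This is a sketch under assumptions: that the request is a form-encoded POST is not stated in this document, the function `buildCreateRequest` is hypothetical, and the value `page` for `contentSelect` is a made-up example.

```javascript
// Sketch: encode the dialog inputs as form parameters for the create request.
// The URL and the parameter names are taken from this document; the HTTP method
// and the form encoding are assumptions.
const CREATE_URL = "/bin/cpm/ai/authoring.create.json";

function buildCreateRequest(fields) {
  const params = new URLSearchParams();
  params.set("prompt", fields.prompt);
  params.set("contentSelect", fields.contentSelect);
  if (fields.textLength) {
    params.set("textLength", fields.textLength); // absent for single-line textfields
  }
  return { url: CREATE_URL, method: "POST", body: params.toString() };
}
```

The returned object could then be handed to whatever AJAX facility the dialog code uses; the sketch deliberately stops before sending anything.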

automatically added:

ChatGPTDialogServlet "Add a third operation creationDialog for resourcetype .../dialog/create"

Test cases

Some informal test cases:

  1. New Page Creation Workflow:

    a. Create a new page with some text content.
    b. Open the dialog for setting page categories.
      • Test: Verify that the "Loading Indicator" is displayed while waiting for the category suggestions from ChatGPT.
      • Test: Verify that the "Current Categories Section" is not visible since no categories have been set yet.
    c. Wait for the category suggestions from ChatGPT.
      • Test: Verify that the "Loading Indicator" disappears when the suggestions are ready.
      • Test: Verify that the "ChatGPT Suggestions Section" is displayed with the category suggestions from ChatGPT.
    d. Select some of the suggested categories and click the "Accept" button.
      • Test: Verify that the selected categories are saved to the page.
    e. Reopen the dialog for setting page categories.
      • Test: Verify that the "Current Categories Section" is now visible with the categories selected in the previous step.
  2. Existing Page Editing Workflow:

    a. Open an existing page with some categories already set.
    b. Edit the page content and save the changes.
    c. Open the dialog for updating page categories.
      • Test: Verify that the "Loading Indicator" is displayed while waiting for the category suggestions from ChatGPT.
      • Test: Verify that the "Current Categories Section" is visible with the categories previously set.
    d. Wait for the category suggestions from ChatGPT.
      • Test: Verify that the "Loading Indicator" disappears when the suggestions are ready.
      • Test: Verify that the "ChatGPT Suggestions Section" is displayed with the category suggestions from ChatGPT.
    e. Deselect some previously used categories, select new relevant categories from the suggestions, and click the "Accept" button.
      • Test: Verify that the updated categories are saved to the page.
    f. Reopen the dialog for updating page categories.
      • Test: Verify that the "Current Categories Section" is now updated with the categories selected in the previous step.
  3. Error Handling:

    a. Open a page with a large amount of text that could potentially take a long time to analyze.
      • Test: Verify that the "Loading Indicator" is displayed and remains visible for as long as the analysis is ongoing.
    b. Simulate a failure in the ChatGPT category suggestions.
      • Test: Verify that an appropriate error message is displayed in the "Alert" section.

Possible extensions

Likely extensions

These are highly recommended, but have lower priority and require some effort.

  • The title attribute of the loading indicator should show the last actual request sent to ChatGPT, for transparency and debugging. A click on it could open a full screen read only dialog showing the complete request, scrollable.

Not planned for now

These ideas might or might not make sense - that's best reviewed after the feature is implemented and has been used.

  • Request several variants simultaneously
  • temperature setting
  • selection of desired tone and writing style, like in Superpower ChatGPT
  • save parts of prompts for reuse (e.g. tone, writing style, general comments about the site, definitions, slogans)
  • Some kind of templates: predefined structure descriptions for specific functions of the text
  • Advanced Text Editing: Incorporate features such as grammar and spelling checks, readability analysis, and style suggestions, check whether it fits the intended tone
  • More kinds of history, to go back to previous suggested variants and / or prompts
  • automated linking to other pages / external content
  • content suggestions reviewing the whole page.
  • AI-powered image selection
  • The dialog should save the last used settings (e.g., the chosen additional input and desired text length) for the next time the dialog is opened. It is not yet clear to what extent: that depends on the attribute, the component type and the component instance.

Glossary

  • Component: A reusable building block in Composum Pages, which can contain text attributes.
  • Attribute: A property of a component, which can contain text.
  • Dialog: In this context, the user interface for editing component attributes, and the proposed interface for interacting with the ChatGPT API.
  • ChatGPT Completion API: The API used to generate text from prompts.
  • Prompt: A text input that guides the AI in generating a specific type of text.
  • Replace/Append: The actions to take with the generated text. Replace will change the current attribute text with the generated text, and append will add the generated text to the end of the current attribute text.
  • Textfield/Textarea/Richtext Editor: Different types of input fields for text in Composum Pages.
  • Text Length: A user-specified guideline for how long the generated text should be.
  • Additional Input: Existing text that is used to give context to the AI when generating text.
  • History: A record of generated texts for a specific attribute during a session of the Content Creation Dialog.
  • Session: A single use of the Content Creation Dialog, from opening to either replacing/appending text or closing the dialog.
  • Loading Indicator: A visual signal to show when the AI is processing a prompt and when it is ready.
  • Alert: An area to display error messages or warnings.

Footnotes

  1. ChatGPT prompt to create that drawing: Please create an ascii art of the dialog, rendered as markdown code block with 4 spaces indentation. Buttons should be rendered like [Cancel] when "Cancel" is the text on them, so that the layout is nicely shown. Drop down lists can be rendered like [/Predefined]. Text fields, Text areas should be shown with a description what is in there, spaces rendered as _, and with more _ showing the full space they occupy. (For text areas that will be several lines.) Otherwise the dialog should look as closely as ascii art can make it to the fully implemented dialog. The names of groups and subgroups should not be shown, except if they should appear in the fully implemented dialog. No explanation is necessary, please render just a drawing of the dialog in a ascii art code block.

  2. ChatGPT prompt to create that drawing: Please create a code block with a SVG representation of the dialog, that could be rendered by a browser to display a suggestion for the dialog. The dialog should have a frame, group subgroups also with a small frame that surrounds the group of buttons etc. The names of groups and subgroups should not be shown, except if they should appear in the fully implemented dialog. The text fields and text areas should be rendered as a frame, with a descriptive text shown inside. Render buttons and drop down lists with a frame, and indicate with a suitable symbol the drop down list. No explanation is necessary, please render just a drawing of the dialog in a SVG code block. Please output only the svg tag and the svg elements, no comments, and take care to create a valid SVG including the xmlns declaration.