Take-aways From OpenAI’s First Dev Day
OpenAI has recently introduced new features in ChatGPT Plus, notably GPT4Turbo, which offers an expanded context window, making it ideal for more extensive tasks. Alongside, the introduction of specialized GPTs marks a significant evolution. Previously, distinct tasks required separate tabs, with each conversation serving as the context. Now, users can create a bespoke ‘GPT’ tailored to specific needs. Additionally, these advancements streamline processes that previously relied on basic API calls combined with user prompts, exemplified by applications like a ‘professional LinkedIn writer assistant’. In this article, we’ll delve deeper into the potential impacts of these innovations on SaaS applications.
Text-to-Speech and Speech-to-Text Implications
Accessibility, often referred to as A11Y, stands as a pivotal application for AI technology. The integration of text-to-speech and speech-to-text functionalities significantly broadens the scope of GPT usage across various scenarios. From facilitating hands-free operations during phone calls to providing voiceovers for content and reading aloud PDF summaries, these features enhance user interaction with digital content. As we navigate through an ever-growing expanse of online information, the ability to summarize, evaluate quality, and vocalize content is set to redefine our web searching experience in the coming years. Moreover, these advancements hold immense potential in aiding individuals with disabilities, making online productivity more accessible and efficient.
Today, I needed to create a frontend form with a password input, featuring a toggle icon to show or hide the password, and a visually appealing image on the right side to enhance the signup page’s design. This task required a bot that could handle text, code, and images seamlessly. Multi-modality proved invaluable here, allowing me to manage all these elements in one place without juggling different services or tabs. This example highlights the practicality and efficiency of multi-modality in streamlining such creative tasks.
JSON Mode API
I often describe frontend development as ‘dressing up a JSON’. Now, with web access akin to Bing’s capabilities, envision a bot that not only responds in JSON but also allows us to specify the fields we want in the response. With such tool we already have a natural language API, whose DB is the whole training data. For a developer this is the most important novelty revealed, and will change the web, we will be able to craft dynamic UI sections in a website, for dynamic inputs and use cases of a user.
GPTs: The tiny startup wrecking ball and how to tune them
When ChatGPT emerged, numerous programmers in their spare time launched what can loosely be termed ‘startups’ — often just a GitHub repository, a small team of 2–3 developers, and a paywall protecting various innovative ideas. Many of these ventures have now become obsolete, thanks to new features from OpenAI. This shift arguably serves the best interests of both users and OpenAI, bringing these initiatives under one platform with a revenue-sharing model.
However it’s hard to believe at the moment that this will be the new app development in the early 2010s, there will opportunities of course, but not very large ones, because creating a GPT is:
- Very easy, if you know english that’s more than enough
- Very fast, have ten minutes? You can create a decent one!
- They have some features and options but not too many
So there are going to be A LOT of them, I would guess about 10 times the number of paid ChatGPT users, and being able to have the spotlight on such crowd will be really hard, there will only two differentials, to paid GPTs, next month OpenAI will open their store,so the gold rush is about to get started :
- Your brand/name
- Your data quality
- The API you’re calling
Let me explain this further, by revealing a not so shocking card on my sleeve, I create a bot to help me write blog posts including this one, and will use it as an example
The name, description and picture just serve one purpose to display to others. The conversation starters are those suggestions down below, a nice onboarding to the bot so the user know the main use cases but not a huge deal.
The big deal are Instructions, Knowledge and Actions!
Knowledge allow you to upload up to 20 files, each one with up to 128K tokens(or around 300 pages), so you have 6000 pages of database and the bot will be a specialist on those. Can be a costumer service protocol, on boarding a new employee, a telemarketing guide, the technical specs of all products in a given company, all tourist attractions in a city, I can think of too many examples.
The instructions are just the context so the bot can know what’s all about, the create tab is a bot that fill this text for you, and you can edit it further, here is the Tech Blogger Pro example:
“Your name is Sara, Tech Blogger Pro, specialized in AI, Blockchain, Web Development, and Infrastructure, will clearly state its limitations when encountering topics outside its expertise. It will provide general guidance on such topics and, where possible, link to external resources for further information. This approach ensures accuracy and reliability in its responses, maintaining trust with its audience. The GPT will balance its comprehensive knowledge in its areas of expertise with a transparent and helpful approach when dealing with unfamiliar or speculative subjects.”
It’s about how to respond, a name I can call more easily, things like these.
And for last but not least Actions, which is the last button, here is a print if you ask a weather app example, is an object that guides the bot on retrieving external data source, so they’re a quite flexible tool, specially if you already have the data, and an API available.
The ease of creating GPTs has democratized AI, allowing even small teams and individuals to contribute to this evolving landscape. However, with this democratization comes the challenge of differentiation in a rapidly saturating market. The key to success in this new era lies in the uniqueness of your brand, the quality of your data, and the strategic use of APIs.
As OpenAI prepares to launch its store, we stand on the cusp of an AI gold rush. This wave of innovation will undoubtedly reshape how we interact with the web, create content, and develop applications. My personal foray into using a GPT for blog writing is just a glimpse of the vast potential these tools hold. We must now look forward to harnessing these capabilities responsibly and creatively, ensuring that we not only ride the wave of this technological revolution but also contribute to its direction and impact.