Bring OpenAI’s ChatGPT model in Azure to your own enterprise-grade app experiences with precise control over the knowledge base, for in-context and relevant responses. Interact with your organization’s private internal data, while respecting the information protection controls put in place.
Azure OpenAI service is combined with Azure Cognitive Search to index and retrieve data that is private and external to the ChatGPT large language model. The retrieval step in Azure Cognitive Search finds the most relevant pieces of information and presents the top ranked results to the language model. And because the knowledge lives outside of the ChatGPT model, you’re in control — it’s not used to train the model.
Include cited sources in generated responses.
ChatGPT works with the Azure OpenAI model and Cognitive Search to provide users what they need to validate responses. See a typical app experience in action.
Restrict information only to those involved.
Implement document-level granular user access with Azure Cognitive Search with the ChatGPT model. Check it out.
Add new information & get an updated response almost instantly.
Watch our video here.
00:00 — Introduction
01:29 — Apply ChatGPT to enterprise apps using Azure
03:40 — Demo: Typical app experience
05:45 — How ChatGPT generates a response
07:55 — Experiment with prompts
09:38 — How information protection works
11:03 — Process for adding new information
12:01 — Code behind the sample app
15:00 — Wrap up
Watch our OpenAI fundamentals show at https://aka.ms/OpenAIMechanics
Try out the sample app on GitHub at https://aka.ms/entGPTsearch
More on Azure Open AI service at https://aka.ms/azure-openai
Check out Azure Cognitive Search at https://aka.ms/azsearch
Unfamiliar with Microsoft Mechanics?
As Microsoft’s official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
- Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries
- Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog
- Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast
Keep getting this insider knowledge, join us on social:
- Follow us on Twitter: https://twitter.com/MSFTMechanics
- Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/
- Enjoy us on Instagram: https://www.instagram.com/msftmechanics/
- Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics
- Today we take a look at how you can bring OpenAI’s ChatGPT model in Azure to your own enterprise-grade app experiences, so that you can interact with your organization’s private internal data, while respecting the information protection controls that you have in place, and along the way, we’ll deconstruct how it all works with a breakdown of ChatGPT prompts. And joining me again on the show to go hands-on with ChatGPT is Microsoft Distinguished Engineer, Pablo Castro. Welcome!
- Thanks Jeremy, it’s great to be back!
- And it’s great to have you back on the show. Now before we go hands-on with all the tech, it’s worth mentioning that since our last show on the topic, the Azure OpenAI service, it’s now generally available. So, this is the service that gives you programmatic access to OpenAI large language models to use with your own apps. The GPT model from OpenAI in the Azure service now adds support for chat interactions. In fact, if you missed our last show with Pablo, it’s worth checking out at aka.ms/OpenAIMechanics to learn more about the fundamentals of building prompts that guide the output of the OpenAI models as you build app experiences. Now, ChatGPT is of course one of the fastest adopted technologies in recent years. And at Microsoft, we’re in fact integrating OpenAI models with related experiences across the Bing search service, GitHub Copilot for AI generated code, and as recently highlighted, the Microsoft 365 portfolio of apps with Copilot, just to name a few. So Pablo, what potential do you see then in terms of applying ChatGPT to enterprise-grade applications on the Azure service?
- Well, it’s pretty exciting. We can now build applications that combine the ChatGPT model with your own data. This can transform not only the way we interact with apps but also our ability to effectively use vast amounts of data to answer questions, generate content, or any number of new and emerging tasks. ChatGPT is a new model optimized for conversational-style interaction, though it can be used for other tasks as well. It uses a particular convention and syntax to denote turns between the user and ChatGPT as “assistant” in a conversation. And while it’s trained on public data, we can construct prompts that include both instructions and additional data to generate responses. So, imagine taking ChatGPT and applying it to your own data but with precise control over the knowledge base for in-context and relevant responses. We can do that using an approach often called “Retrieval Augmented Generation”. In this case, we combine the Azure OpenAI service with Azure Cognitive Search to index and retrieve data of all kinds, knowledge that is private and external to the ChatGPT large language model. The retrieval step in Azure Cognitive Search finds the most relevant pieces of information, even if it’s millions of documents or data points and presents the top ranked results to the language model, and this lets you have detailed informed interactions with your data. And because the knowledge lives outside of the ChatGPT model, you’re in control of it, and it’s not used to train the model. And equally important from an enterprise perspective, any chat session state lives entirely within your application. And whether you keep it or not and where is fully up to you.
- Just to clarify, by “private knowledge”, we mean only data that exists within your organization or your application’s boundary.
- That’s right, or equally you could be a SaaS vendor that wants to enlighten your application to provide an in-context conversation or content generation experience for your customers by using the data you manage for them. So, you’re just using the large language model’s understanding and reasoning capabilities and building your own app experience around it.
- That makes a lot of sense. So now we’ve had all the context, so can you show us an example of how this would work in an app?
- Sure, we’ll walk through a typical app experience that you can build around ChatGPT. I have here a sample Human Resources web application. By the way, we’ve published the code for this whole app, including the UX, the backend, and the sample data in a public GitHub repo at aka.ms/entGPTsearch for you to try it out or use as a starting point for your own apps once you’ve watched the show. So, this app lets employees generally chat about topics related to their employment benefits and employee handbook. In this case, I want to ask about healthcare coverage, which can be unique by plan, location, individual, etc. So, I’ll type “Does my plan cover annual eye exams?” In the generated response, you will see that not only does ChatGPT use the knowledge necessary to derive a response but as a best practice also cites its sources. That’s because a key area we are actively exploring is how do we make responses trustworthy? These models are not perfect, so we see this as a collaboration between the user and the app where the app reads through millions of data points and picks a few to formulate an answer. In this case it showed the source it used for the facts, enabling the user to validate the response generated from the “Benefit Options PDF” source if needed. As I mentioned earlier, we achieve this by coordinating the Azure OpenAI model, Cognitive Search, and how we pre-process the data. Now, I’ll type a follow-up question. “How about hearing?” While the question in isolation wouldn’t make sense, in the context of the chat history, it can figure out what information it needs to answer. For each response, I can see the preview of the file from the citation. And it also shows all the supported content used to formulate the response. And so, in a way, what we’re doing here is we’re conditioning ChatGPT to produce source citations in its response, which then helps you to validate them. So, can you explain how then this was able to figure out the response?
- Sure, we wrote this app, so it would expose the details of what happens in each turn of the conversation. Let’s go back to our follow-up question, “How about hearing?” which, as I mentioned on its own, doesn’t present enough context to be answered in isolation. When I click on this light bulb, I can see the process it went through to provide a response. Here we can see that GPT first takes the chat history and the last question to produce a good search query. You can also see the rest of the prompt, including some of the mark up used to tell ChatGPT where turns are. By the way, there’s also a new API that has more structure around this, so you don’t have to construct the prompts with markup manually. Finally, you can see the “sources” part, which is where we inject the fragments of documents we recalled from the search index. And here’s the final prompt we send to ChatGPT to generate a complete response. By the way, here we used the “Retrieve-Then-Read” approach for generating responses. This is an easy-to-understand approach that can be effective in simple cases. That said, we’ve explored other approaches we include in the app that you can experiment with. For example, “Read-Retrieve-Read” would present the model with a question and a list of tools it could select from, such as “search the knowledge base” or “lookup employee data”. So here, it would decide to search the knowledge base to find out more about the healthcare plan, it would see that there are different plans, and so to determine which plan applies, it then would look up the employee information using another tool to finally arrive at an answer. Another approach is what we call “Read-Decompose-Ask” which would follow this “chain-of-thought” style of prompting. This would break down the question into individual steps of the thought process and answer intermediate sub-questions to accumulate partial responses until it arrived at a complete answer that it would be ready to send back to the user.
- What I really like about this app sample is it really helps you to deconstruct the fundamentals for building these types of app experiences. To that point though, with the logic included into this template, presumably you can tweak the prompts more, right?
- You can. Beyond the user experience, the template lets you easily experiment and configure exactly how responses are generated and channel what a user would expect to see. As a developer, you can include additional instructions using a prompt override. For example, this can influence the tone or style of a response. So, I can change the style of the response. Just for fun, I’ll make it to answer like a pirate. I’ll copy this question and start a new session using this override and paste the question again. And once the new response is generated, you can see it responds like a pirate might, I guess. In all seriousness, this is very powerful because you can easily adapt the style of response. For example, you might want to format the responses to be more concise for a mobile device or add structure. One of my favorite uses of this is to have it generate a response that can use formatting to better organize the information if requested. I’ll paste some text to inject a few more instructions into the prompt to do that. Let me start another chat session and this time, I’ll use this suggested question to compare two healthcare plans. You’ll see it gives me this long answer at first. And now I can ask it to summarize the response in a table and it gives me this nice table comparing the two plans. These are just a few examples. You could do any number of things, even switching responses to a person’s native spoken language on the fly, it’s up to you and the application experience you want to create.
- So, these stylistic and near real-time format changes are a big part of GPT and can also help make information a lot more accessible. Switching gears though a bit, I know a lot of people are probably wondering how information protection in this case works. So, can we talk about how you might make sure that the information that’s surfaced is indeed what that user is allowed to see?
- So, since the data of the knowledge base is in an Azure Cognitive Search index, there are a variety of building blocks for security and filtering you can use for access control, like implementing document-level granular access control. So, let’s demonstrate this together. In this case, let’s say our organization is working on an office move to optimize collaboration between teams. Information about the project is restricted to only those of us involved. I’ll type “Is there a move or office space change coming?” And the generated response says, “I don’t know.” Because there are no documents I can see that talk about this. The information available in the sources does not have an answer. So Jeremy, why don’t you try this out on your laptop?
- Sounds good! So, here I have actually got the app open on my Surface. I’ll go ahead and type in the same question. “Is there a move or office space change coming?” And I can immediately see from the generated response that there is a move indeed happening and it’s also cited the source: a plan for the office move that’s called “TheShuffleProject.pdf”
- Right. So, the information is restricted to those involved for now. I’m not part of it but you’re actively part of the project team, so you have access to this information.
- Okay, so in this case, you’ve actually written the solution to make sure that only people can see what they’re allowed to see. So, why don’t we switch gears again though and talk about the process for adding new information into search. So, how long does something like that take?
- Well, let’s try it out. This time I’ll ask something a little bit more random, like “Can I get scuba gear covered by my benefits?” And you can see, it found documents focused on benefits coverage but says it doesn’t have information on scuba gear being covered. I have this script here. And in a real application, this would be an automated process that runs as data changes but here I need to make it happen on the spot. I’m going to manually run this script and add the information on a new benefits plan into our knowledge base. So, now with the new content added, I’ll ask the same question again. And you can see with the new knowledge available to it, it generates an updated response.
- And just to be clear, we didn’t speed anything up in terms of using any Movie Magic. The response actually reflects the change almost instantly after Cognitive Search has access to the new information. So to understand the logic a little bit better, can you show us then the code that’s running behind your sample app?
- Sure. As part of the sample, we included notebook versions of the interesting parts of the backend. The nice thing about showing this in a notebook is that you can see how the state from a chat session is passed from one prompt to the next. Here we are in the notebook in Visual Studio Code. You can see that everything is wired up to Azure services with the right sources, and managed identities to authenticate. Then it create variables for parts of the prompts. So now, let’s run the whole thing to the end for a first conversation turn. And here we can see the output. Now, let’s do a second turn but this time we’ll do it step by step. First let’s check the history and we can see the previous user question and the response from the model. Now I’ll update the question in line with “How about hearing?” And it is first sent to GPT to map the history and context and generate a search query. From there it uses cognitive search to run a query and get candidate documents. Next, I’ll pull up the content and this was returned from our top search results. And this is really the magic with ChatGPT. We can see the prompt evolves with each interaction. This pattern uses everything to construct a prompt: the user question, the chat history, the search results, all to make one big prompt with instructions. Then in our final step, it calls the Azure OpenAI completion API to get a response based on the entire prompt. You can see that the session history is just kept in memory in this case. The model itself doesn’t track it and it’s up to your application. You could choose to store it or, like we do here, simply let it go when you close your session. In either case, the session history is not added to the large language model. Now, I showed you some prompt experimentation in the notebook. Now, if you don’t want to experiment using the notebook or using the code, you can also use the Azure OpenAI Studio Playground to experiment with ChatGPT prompts interactively. The GPT-3.5 Turbo model has been added to the playground along with a new chat interface and all the configuration parameters.
- And by the way, you can watch our entire show at aka.ms/OpenAIMechanics to learn about how to use Azure’s OpenAI Studio. Now, you also mentioned that this is based on GPT-3.5 Turbo but with the release of GPT-4, how does change the approaches that we’ve shown today?
- Indeed. We were running the sample on GPT-3.5 today because that’s the model that most people will have access to. That said, everything we’re discussing here applies to GPT-4 as well where some scenarios will perform similarly and others will work much better, thanks to its advanced reasoning capabilities and a much prompt length limit.
- And apparently, it can also even pass the bar exam. So, what else would you recommend then for anyone who’s watching who’s looking to build out their own enterprise-grade ChatGPT enabled apps?
- So first, try out the sample app I demonstrated earlier. You can find it on GitHub at aka.ms/entGPTsearch. You can find it on GitHub at aka.ms/entGPTsearch. The sample has everything you need to get started, including creating the Azure services and even the sample data we used. In just a couple of hours, you can have a version of what I showed you running with your own data. Then, to learn more about the Azure OpenAI service, you can go to aka.ms/azure-openai. And for Azure Cognitive Search, check out aka.ms/azsearch.
- Pablo, it’s always a pleasure having you on the show. And also, don’t forget to subscribe to Microsoft Mechanics, the latest in tech updates. Thanks for watching, we’ll see you next time!