Working With Large Language Models: Powerful Tools, New Risks

Michael DeBellis
21 hours ago

LLMs enable new ways for users and software to collaborate. This makes them incredibly powerful, but also introduces new kinds of risks that we haven't experienced with previous software tools.

I have been consistently optimistic about Large Language Models. That is not because I think they are perfect. It is because I have spent enough of my career working with AI, knowledge representation, and natural language processing to know how hard the problems they solve are. The first time I saw a modern LLM carry on a useful technical conversation, help debug code, summarize a complex topic, generate examples, and then revise its answer based on feedback, I had the same reaction I still have: this is extraordinary. It is not magic. It is not human intelligence. It is not a substitute for expertise. But it is one of the most powerful productivity tools I have ever used.

That said, the LLM critics raise important issues.

In several previous posts, I have emphasized the upside: LLMs as accelerators for writing, coding, design exploration, ontology development, documentation, and research. I still believe all of that. But LLMs can also encourage bad habits. They can make it easier to stop thinking carefully. They can give plausible answers when what we need is disciplined analysis. They can make us confuse a useful conversational workspace with a reliable system of record.

A recent example makes this point very clearly. In a Nature career column, Marcel Bucher, a professor of plant molecular physiology at the University of Cologne, described losing two years of structured academic work after changing a ChatGPT data-consent setting. The work included materials related to grants, teaching, publications, lectures, exams, and student-response analysis. According to Bucher, when he disabled the setting, his saved chats and project folders were emptied, and the material could not be recovered.

One of my favorite YouTube creators, Angela Collier, discussed this in her interesting and amusing video “This is what 2 years of ChatGPT does to your brain”:

https://www.youtube.com/watch?v=7pqF90rstZQ

It is an alarming story, and I think it illustrates something important. But to me, the lesson is not “don’t use LLMs.” The lesson is “don’t use an LLM workspace as if it were your source repository, your document management system, your Integrated Development Environment, or a replacement for your critical thinking.”

That distinction matters.

LLMs are extremely useful as collaborators. They are not replacements for other kinds of tools. A conversation with an LLM may feel like a persistent workspace, especially when the interface supports projects, folders, memory, file uploads, and long-running context. But that does not make it the same thing as GitHub, Google Drive, or a file system with backup and recovery. If the only copy of important work is in a chatbot history, the problem is not an LLM problem. It is a workflow problem.

I have made my own version of this mistake, though in a smaller and less catastrophic (but still embarrassing) way.

A couple of weeks ago, I was debugging a Python problem in a Streamlit interface for a Retrieval Augmented Generation (RAG) system. The application was calling AllegroGraph SPARQL “magic properties” that integrate knowledge graphs with the OpenAI APIs. Instead of first looking carefully at the error message myself, I copied and pasted it into ChatGPT and asked for help.

That usually works. In fact, it works so often that it made me lazy.

The problem was in code that created the connection object to AllegroGraph:

conn = ag_connect(
    repo='streamforge_data_catalog',
    host=os.getenv("AGRAPH_HOST", "localhost"),
    port=int(os.getenv("AGRAPH_PORT", "10035")),
    user=os.getenv("AGRAPH_USER"),
    password=os.getenv("AGRAPH_PASSWORD"),
)

The correct repository name for that project was 'streamforge_data_catalog'. But in my actual code, I had the repository name from a previous project.

That is a dumb error. It is exactly the kind of error that normally would be one of the first things I would check when debugging without LLM help. But because I sent the problem directly to the LLM without first doing my own basic inspection, the conversation went down a lot of blind alleys and I wasted close to a full work day. I was convinced I had found a bug in the AllegroGraph SPARQL magic properties and was about to send an email to support when, luckily, I (for the first time) took a close look at the error message. As I did, I hit my forehead and realized from the error message what the problem probably was. Sure enough, when I looked at the code where I created the connection object I saw the error.
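
A minimal sketch of the kind of pre-flight check that could surface this class of mistake before any query runs. It only builds and inspects the keyword arguments shown above; it does not import or call AllegroGraph. The names EXPECTED_REPO, connection_settings, and check_settings are hypothetical, as is the stale repository name used in the demo.

```python
import os

# Hypothetical constant: the repository this project is supposed to use.
EXPECTED_REPO = "streamforge_data_catalog"


def connection_settings(repo, env=None):
    """Collect the keyword arguments that would be passed to ag_connect."""
    env = os.environ if env is None else env
    return {
        "repo": repo,
        "host": env.get("AGRAPH_HOST", "localhost"),
        "port": int(env.get("AGRAPH_PORT", "10035")),
        "user": env.get("AGRAPH_USER"),
        "password": env.get("AGRAPH_PASSWORD"),
    }


def check_settings(settings, expected_repo=EXPECTED_REPO):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if settings["repo"] != expected_repo:
        problems.append(
            f"repo is {settings['repo']!r}, expected {expected_repo!r}"
        )
    for key in ("user", "password"):
        if not settings[key]:
            problems.append(f"{key} is not set")
    return problems


if __name__ == "__main__":
    # "old_project_repo" is a made-up stale name left over from earlier work.
    settings = connection_settings("old_project_repo")
    for problem in check_settings(settings):
        print("config problem:", problem)
    # Only when the list is empty would the real call follow:
    # conn = ag_connect(**settings)
```

Printing the settings (minus the password) at startup would serve the same purpose. The point is that the repository name becomes the first thing you see rather than the last.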

That was not the LLM’s fault. It was doing what I asked it to do. Tracing from vectors to objects in the knowledge graph can be tricky, and in the past ChatGPT has helped me navigate those connections. The problem was that I had become lazy: I hadn't given any thought to the error message but had jumped straight to asking ChatGPT to debug it, assuming there was some issue with using SPARQL to match both vectors and related knowledge graph objects.

This is one of the risks of LLMs. They are so powerful that they can encourage us to get lazy. The danger is that we begin to rely on them in places where we should still be exercising our own judgment or using traditional tools: storage with backup and recovery, or a Python debugger where we can view the stack and the context in which the error occurred.
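
As an illustration of what a first look at the stack and the error context can mean in Python, here is a small sketch that uses only the standard library traceback module. The helper summarize_exception and the failing function open_repository are hypothetical stand-ins, not part of any real application or library.

```python
import traceback


def summarize_exception(exc):
    """Return 'ErrorType: message (file:line in function)' for an exception."""
    frames = traceback.extract_tb(exc.__traceback__)
    if frames:
        last = frames[-1]  # innermost frame: where the error was raised
        where = f"{last.filename}:{last.lineno} in {last.name}"
    else:
        where = "no traceback available"
    return f"{type(exc).__name__}: {exc} ({where})"


def open_repository(name):
    # Stand-in for a real call that fails; the name and message are made up.
    raise LookupError(f"repository {name!r} not found")


if __name__ == "__main__":
    try:
        open_repository("old_project_repo")
    except LookupError as exc:
        print(summarize_exception(exc))
        # For interactive inspection of the stack, the standard library
        # debugger can be started here with: import pdb; pdb.post_mortem()
```

Reading the one-line summary takes seconds, and it is often enough to show whether the problem is in your own configuration or somewhere deeper.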

There is a deeper point here that I think is worth more discussion. Some of the errors LLMs make are frustrating precisely because they resemble errors that happen in human collaboration. They misunderstand context. They accept a mistaken premise. They try to be helpful before they have clarified the problem. They generate a plausible explanation that fits the surface of the request but misses the real issue.

In earlier eras of AI, systems failed in brittle, mechanical ways. They did not understand enough context to make subtle collaborative mistakes. Modern LLMs operate at a much higher level of abstraction. That is progress. But once tools begin operating at that level, a new class of errors becomes inevitable. They are no longer just syntax errors, parser failures, or missing database fields. They are errors of context, framing, assumption, and collaboration.

That does not make the errors acceptable, nor does it mean any particular error is unavoidable. It means we need to understand what kind of tool we are using and how best to use it.

An LLM is not an Integrated Development Environment (IDE). It is not a backup service. It is not a substitute for expertise. It is a powerful assistant that can help us think, design, develop, debug, analyze, and critique. Used that way, it is transformative. Used carelessly, it can make existing weaknesses in our workflow worse.

So what does good use look like?

Important work should live outside the LLM in the appropriate (non-LLM) tools. Drafts, code, data, diagrams, prompts, notes, and decisions should be saved in durable systems such as file systems with regular backups, Git repositories, cloud storage, document management systems, and databases. The LLM can help create and refine those artifacts, but it should never be the primary place they exist. This matters because, in my opinion, if you use the results of an LLM without editing them, you are probably missing important issues. Whether it is generating code, presentations, ontologies, or text, I always treat what the LLM generates as an initial version that I test and almost always change.
Do a first pass yourself before asking the LLM. For debugging, read the error message. Use the debugger. Then, if you still can't figure out the problem, ask the LLM for help. The prompt will be better, the answer will usually be better, and you'll avoid wasting time, as I did, trying to solve the wrong problem.
Ask the LLM to critique rather than merely assist. I often use prompts such as “Critique this design,” “Tell me what I’m missing,” and “Find flaws in this argument.” LLMs are designed to be helpful and engaging. In some contexts, that is what we want. But in technical and intellectual work, we need pushback and critical thinking, not just yes-bots that always agree with us.
Separate brainstorming from authority. An LLM-generated explanation may be useful, but it is not automatically reliable. Factual claims need verification, especially current facts or anything involving dates and versions. When the LLM gives you links, be sure to check them. I often find that links are broken or go to pay-to-play journals that I don't consider good sources. Just as with Agile development, iteration is the key to getting the best quality, and that requires you to continue to do your part and not just rely on the LLM.
Preserve provenance. When using an LLM for research or technical writing, keep track of what came from where. Save links, source passages, citations, and intermediate notes. This is especially important for knowledge graph and ontology work, where provenance is not a nice-to-have feature. It is part of what makes the system trustworthy. But don't confuse this provenance with your final deliverables.
Use the LLM as a collaborator, not as a replacement for understanding. The best results come when the human still owns the problem. The LLM can accelerate the work, suggest alternatives, identify blind spots, and produce drafts. But the human still has to decide what is correct, what is relevant, and what should be preserved.
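
The first and fifth practices above (keep the work in durable files, and keep track of where it came from) can be sketched in a few lines of standard-library Python. This is only one possible shape, under stated assumptions: the function name save_draft, the sidecar file suffix, and the fields in the record are all hypothetical, and the target directory is assumed to be one that is already backed up or under Git.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_draft(text, path, source, prompt=None):
    """Write LLM-assisted text to a file plus a JSON provenance sidecar.

    Returns the path of the sidecar file.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")

    # Hypothetical record layout: enough to answer "where did this come
    # from, and has a human reviewed it yet?"
    record = {
        "file": path.name,
        "source": source,
        "prompt": prompt,
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "reviewed": False,  # flip to True only after testing and editing
    }
    sidecar = path.with_name(path.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

A call such as save_draft(text, "drafts/design_notes.md", source="ChatGPT conversation") leaves two ordinary files on disk that any backup tool or Git repository can handle, so nothing depends on the chat history surviving.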

My view has not changed: LLMs are among the most important productivity tools we have seen. They can help with writing, programming, research, ontology modeling, documentation, explanation, and design. They enable me to do things in hours that previously required days.

But the critics make important points. LLMs can encourage over-reliance. They can make mistakes. They can reinforce bad assumptions. They can produce confident nonsense. And they shouldn't be used as a replacement for tools that do the basics like reliably saving and backing up your files or debugging code.

The right conclusion is not to reject them. The right conclusion is to use them better.

Treat the LLM as a useful but fallible collaborator. Keep your real work in durable systems. Ask for criticism. Verify important claims. Maintain backups. Read the error message before pasting it into the chat.

Used that way, LLMs make us faster, more creative, and more rigorous. But only when we remain responsible for the thinking.
