GitHub Copilot Context
How to provide coding guidelines for GitHub Copilot and influence GitHub Copilot suggestions.
In my project, I use naming conventions for models. I have different suffixes for different model types, and I subclass my models from various base classes.
It would be awesome if I could let GitHub Copilot know about my conventions, so it can suggest more accurate code right from the start.
Based on my experience, GitHub Copilot seems to make suggestions by considering the context beyond the current file. The developers of Copilot themselves have confirmed this, although they haven’t explicitly mentioned what exactly is included in this context. Here’s a discussion on this topic: GitHub Copilot Context Discussion.
A Copilot developer gives this example in that discussion: “By way of example, one technique we use today is to look for relevant code in neighbouring files (i.e. tabs open in the code editor). Imagine that you have two files open – one that defines a Class, and another with unit tests for that Class. By looking at both files, we have more context into the relevant code, which results in higher-quality suggestions.”
So, the idea is to write the coding guidelines down explicitly and see whether Copilot picks them up.
I’ve created fairly detailed guidelines on how to write models for the same entity depending on their purpose: upstream, database, or API models. In the instructions, I included an unusual requirement, adding the docstring “ROMAN LOVELY [TYPE] MODEL”, as a marker to check whether the instructions are taken into account.
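To avoid eyeballing every suggestion, the marker check could be automated. The following is a minimal sketch, not part of the original experiment: it assumes that [TYPE] is replaced by UPSTREAM, DATABASE, or API, and the function name `classes_missing_marker` and the sample class names are made up for illustration.

```python
import ast
import re

# Marker format from the guidelines: "ROMAN LOVELY [TYPE] MODEL".
# Assumption: TYPE is one of the three model kinds named in the post.
MARKER = re.compile(r"^ROMAN LOVELY (UPSTREAM|DATABASE|API) MODEL$")


def classes_missing_marker(source: str) -> list[str]:
    """Return the names of classes whose docstring is not the marker."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # get_docstring returns None when the class has no docstring.
            doc = ast.get_docstring(node) or ""
            if not MARKER.match(doc.strip()):
                missing.append(node.name)
    return missing


# Hypothetical code accepted from Copilot suggestions.
sample = '''
class UserApiModel:
    """ROMAN LOVELY API MODEL"""

class UserDatabaseModel:
    pass
'''

print(classes_missing_marker(sample))  # ['UserDatabaseModel']
```

Parsing the source with `ast` rather than importing it means the check also works on code that is not importable yet, such as a half-finished file.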
Tested version: GitHub Copilot 1.4.5.4049 on PyCharm (19 Dec 2023).
Here’s what I tried, what worked, and what didn’t.
- ❌ Doesn’t work. I tried creating coding guidelines in a Markdown file (README.md, guidelines.md, etc.) and opening the file along with the code in a separate tab. Unfortunately, it appears that Copilot only uses code files as context.
- ❌ Doesn’t work. Another attempt was creating coding guidelines in the Python file as a string constant and importing that string constant. However, Copilot doesn’t try to parse the code and follow the imports. I guess that would be asking too much.
- ✅ Works. Finally, I found success by creating coding guidelines in a Python file as a string constant or a docstring and keeping that file open in a separate tab while editing the code. Both string constants and module docstrings work, but I decided to stick with a module docstring.
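For illustration, a guidelines file of this kind could look roughly like the sketch below. It is not my actual file: the class suffixes, base class names, and file paths are hypothetical, and the text is shown as a string constant. The same text placed at the very top of the file as a module docstring is the variant I settled on.

```python
# readme.py - keep this file open in a tab while editing model code.
# All suffixes, base classes and paths below are hypothetical examples.

CODING_GUIDELINES = """
Coding guidelines for models

Every entity has up to three models, depending on purpose.

1. Upstream model (data received from external services)
   - File: models/upstream.py
   - Class name: <Entity>UpstreamModel
   - Base class: BaseUpstreamModel
   - Docstring: "ROMAN LOVELY UPSTREAM MODEL"

2. Database model (data stored in the database)
   - File: models/database.py
   - Class name: <Entity>DatabaseModel
   - Base class: BaseDatabaseModel
   - Docstring: "ROMAN LOVELY DATABASE MODEL"

3. API model (data returned to API clients)
   - File: models/api.py
   - Class name: <Entity>ApiModel
   - Base class: BaseApiModel
   - Docstring: "ROMAN LOVELY API MODEL"
"""
```

Keeping one short, regular block per model type is a deliberate choice in this sketch: the file name, class suffix, base class, and marker sit next to each other, so each block reads like a template for the code being completed.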
Here’s how the suggestions look. As you can see from the screenshots, I have two tabs open. The tab “readme.py” contains the coding guidelines saved in a Python module docstring.
A few notes on these screenshots:
- Sometimes, I had to explicitly set the file name in the first line of the code to help Copilot choose the correct set of instructions.
- When creating a database model, Copilot clearly followed the rest of the instructions but didn’t add the marker docstring to the suggestion.
This is not surprising, as it’s well known how difficult it is to make LLMs follow instructions exactly, and in general, the larger the context, the higher the chance that the model forgets something.
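To make the first note concrete, here is a sketch of what the edited file might look like when the file name is set on the first line and the suggestion follows the guidelines in full, including the marker docstring. The path `models/database.py` and the names `BaseDatabaseModel` and `UserDatabaseModel` are hypothetical, and the base class is defined inline only so that the example runs on its own.

```python
# models/database.py
# The comment above is the "file name in the first line" hint.
from dataclasses import dataclass


@dataclass
class BaseDatabaseModel:
    """Stand-in base class; in a real project it would be imported."""

    id: int


@dataclass
class UserDatabaseModel(BaseDatabaseModel):
    """ROMAN LOVELY DATABASE MODEL"""

    email: str
    full_name: str


user = UserDatabaseModel(id=1, email="roman@example.com", full_name="Roman")
print(type(user).__name__, "->", UserDatabaseModel.__doc__)
```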
Speculations:
- The number of open tabs probably affects how well Copilot follows the guidelines, because the precise readme instructions may get diluted by irrelevant context from other files. So, closing unnecessary tabs in the IDE should help Copilot generate better suggestions.
- It’s possible that if there are too many open files, Copilot only considers a subset of them for context (and maybe not even the entire files). This means that with too many open files, the instructions may be completely excluded from the context, and there’s no way to influence it other than closing irrelevant files.