Wednesday, 15 November 2017

Dropbox releases PyAnnotate -- auto-generate type annotations for mypy

For statically checking Python code, mypy is great, but it only works after you have added type annotations to your codebase. When you have a large codebase, this can be painful. At Dropbox we’ve annotated over 1.2 million lines of code (about 20% of our total Python codebase), so we know how much work this can be. It’s worth it though: the payoff is fantastic.

To ease the pain of adding type annotations to existing code, we’ve developed a tool, PyAnnotate, that observes what types are actually used at runtime, and inserts annotations into your source code based on those observations. We’ve now open-sourced the tool.

Here’s how it works. To start, you run your code with a special profiling hook enabled. This observes all call arguments and return values and records the observed types in memory. At the end of a run the data is dumped to a file in JSON format. A separate command-line utility can then read this JSON file and use it to add inline annotations to your source code.
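To make the idea concrete, here is a minimal sketch of a profiling hook built on the standard library's sys.setprofile(). It is not PyAnnotate's actual implementation: the names observed, _profile_hook and collect are invented for illustration, and it handles only plain positional arguments.

```python
import json
import sys
from collections import defaultdict

# Hypothetical in-memory store: function name -> set of observed signatures.
observed = defaultdict(set)
# Argument types seen at call time, keyed by frame id until the matching return.
_pending = {}


def _type_name(value):
    # Spell NoneType the way a type comment would.
    return "None" if value is None else type(value).__name__


def _profile_hook(frame, event, arg):
    code = frame.f_code
    if event == "call":
        # Positional parameter names come first in co_varnames.
        names = code.co_varnames[:code.co_argcount]
        _pending[id(frame)] = [_type_name(frame.f_locals[n]) for n in names]
    elif event == "return" and id(frame) in _pending:
        arg_types = _pending.pop(id(frame))
        observed[code.co_name].add(
            "(%s) -> %s" % (", ".join(arg_types), _type_name(arg)))


def collect(func, *args):
    """Run func(*args) with the hook enabled (illustrative helper)."""
    sys.setprofile(_profile_hook)
    try:
        return func(*args)
    finally:
        sys.setprofile(None)


def gcd(a, b):
    while b:
        a, b = b, a % b
    return a


collect(gcd, 15, 10)
collect(gcd, 45, 12)
# Serialize the observations, much as a real tool would dump them to a file.
summary = json.dumps({name: sorted(sigs) for name, sigs in observed.items()})
```

The real tool does considerably more (sampling, container types, file paths and line numbers), but the shape is the same: record on call, complete the record on return, dump at the end.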

Once the automatic annotations have been added you should run mypy on the updated files to verify that the generated annotations make sense, and most likely you will have to tweak some annotations by hand. But once you get the hang of it, PyAnnotate can save you a lot of work when adding annotations to a large codebase, and that’s nothing to sneeze at.

PyAnnotate is intended to ease the work of adding type annotations to existing (may we say “legacy”) code. For new code we recommend adding annotations as you write the code: type annotations should reflect the intention of the author, and there’s no better time to capture those intentions than when initially writing the code.


Here’s a little program that defines a gcd() function and calls it a few times:

def main():
    print(gcd(15, 10))
    print(gcd(45, 12))

def gcd(a, b):
    while b:
        a, b = b, a%b
    return a

We also need a driver script:

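from gcd import main
from pyannotate_runtime import collect_types

if __name__ == '__main__':
    collect_types.init_types_collection()
    with collect_types.collect():
        main()  # types are recorded only while this block runs
    collect_types.dump_stats('type_info.json')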

Now let’s install PyAnnotate and run the driver script. The standard output is just what the main() function prints:

$ pip install pyannotate
$ python

At this point, if you’re curious, you can look in type_info.json to see what types were recorded:

[
    {
        "path": "",
        "line": 1,
        "func_name": "main",
        "type_comments": [
            "() -> None"
        ],
        "samples": 1
    },
    {
        "path": "",
        "line": 5,
        "func_name": "gcd",
        "type_comments": [
            "(int, int) -> int"
        ],
        "samples": 2
    }
]

Let’s go ahead and annotate the file (the -w flag means “go ahead, update the file”):

$ pyannotate -w
---        (original)
+++        (refactored)
@@ -1,8 +1,10 @@
 def main():
+    # type: () -> None
     print(gcd(15, 10))
     print(gcd(45, 12))
 def gcd(a, b):
+    # type: (int, int) -> int
     while b:
         a, b = b, a%b
     return a
Files that were modified:
$ mypy

Since the original file was so small, the unified diff printed by PyAnnotate happens to show the entire file. Note that the final command shows that mypy is happy with the results! If you’d rather not see the diff output, use the -q flag.

Where to get PyAnnotate

The full source code for PyAnnotate is on GitHub:
If you’d rather just install and use it, a source distribution and a “universal wheel” file exist on PyPI:


PyAnnotate doesn’t actually inspect every call: the profiling hooks have a lot of overhead, and a test scenario for a large application would run too slowly. For details on the “downsampling” it performs, see the source code. For the same reasons it also doesn’t inspect every item of a list or dictionary; in fact it only inspects the first four. And if a function is called with many different argument types, it will only preserve the first eight that differ significantly.
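The effect of sampling only the first few items of a container can be sketched as follows. This is an illustration under stated assumptions, not PyAnnotate's code: the function describe, the constant MAX_SAMPLED_ITEMS, and the way empty or mixed lists are rendered are all invented here; only the limit of four comes from the description above.

```python
MAX_SAMPLED_ITEMS = 4  # only the first four items are inspected


def describe(value):
    """Return a rough type name for value, sampling list items (hypothetical)."""
    if isinstance(value, list):
        # Look at a prefix of the list only; later items are never seen.
        item_types = sorted({describe(v) for v in value[:MAX_SAMPLED_ITEMS]})
        if not item_types:
            return "List[Any]"  # assumed rendering for an empty list
        if len(item_types) == 1:
            return "List[%s]" % item_types[0]
        return "List[Union[%s]]" % ", ".join(item_types)
    if value is None:
        return "None"
    return type(value).__name__


# The string in fifth position is past the sampling window, so it is missed.
sampled = describe([1, 2, 3, 4, "five"])
mixed = describe([1, "two"])
```

The first call illustrates the trade-off: sampling keeps overhead bounded, at the cost of missing types that only appear later in a container.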

The generated annotations are only as good as the test scenario you’re using. If you have a function that is written to take either an integer or a string, but your test scenario only calls it with integers, the annotations will say arg: int. If your function may return None or a dictionary, but in your test scenario it always returns None, the annotation will say -> None.
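As an illustration of the kind of hand-tweaking this implies, here are two made-up functions (label and find_user are hypothetical names, not from any real codebase) with type comments widened by hand to cover cases a narrow test run would not have observed.

```python
from typing import Dict, Optional, Union  # used by the type comments below


def label(ident):
    # type: (Union[int, str]) -> str
    # A run that only passed integers would have suggested (int) -> str.
    return "id-%s" % ident


def find_user(users, name):
    # type: (Dict[str, Dict[str, int]], str) -> Optional[Dict[str, int]]
    # A run that never found a user would have suggested "-> None".
    return users.get(name)
```

Running mypy after such edits is still the best check that the widened signatures match how the functions are really called.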

Annotations for functions with *args or **kwds in their signature will likely require manual cleanup.

PyAnnotate currently doesn’t generate Abstract Base Classes (ABCs) such as Iterable or Mapping. If you’re a fan of these you will have to do some manual tweaking to the output.

Since NewType and TypedDict are invisible at runtime, PyAnnotate won’t generate those.
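A short demonstration of why a runtime observer cannot see NewType: the value it produces is an ordinary instance of the underlying type. The name UserId is made up for this example.

```python
from typing import NewType

# UserId is a distinct type to mypy, but not to the running program.
UserId = NewType("UserId", int)

uid = UserId(42)
runtime_type = type(uid)  # this is plain int; no trace of UserId remains
```

A TypedDict value is likewise a plain dict at runtime, so the same reasoning applies.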

Because of a limitation of the profiling hook, the runtime collection code cannot tell the difference between raising an exception and returning None.
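This limitation can be reproduced with the standard library alone. In the sketch below (the function names are invented for illustration), a hook installed with sys.setprofile() receives a "return" event with an argument of None both when a function returns None and when it exits by raising.

```python
import sys

events = []


def _hook(frame, event, arg):
    # Record only the "return" events of the two functions under test.
    name = frame.f_code.co_name
    if event == "return" and name in ("returns_none", "raises"):
        events.append((name, arg))


def returns_none():
    return None


def raises():
    raise ValueError("boom")


def run():
    sys.setprofile(_hook)
    try:
        returns_none()
        try:
            raises()
        except ValueError:
            pass
    finally:
        sys.setprofile(None)


run()
```

Both entries in events carry None as the returned value, so a collector built on this hook has nothing to distinguish the two cases.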

Code in __main__ is currently ignored, because the module name doesn’t correspond to the filename.

The current version of PyAnnotate has been designed to work with either Python 2 or Python 3, but we’ve only used it for Python 2 code ourselves (since that’s what we have) and the tool currently generates type comments that are compatible with Python 2. We’re happy to accept a pull request adding the capability to generate Python 3 style annotations though.
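For readers unfamiliar with the two styles, here is the gcd() function written both ways. The second version is written by hand for comparison; it is not output from the tool.

```python
def gcd_py2_style(a, b):
    # type: (int, int) -> int
    # Type comment: valid syntax in both Python 2 and Python 3.
    while b:
        a, b = b, a % b
    return a


def gcd_py3_style(a: int, b: int) -> int:
    # Inline annotations: Python 3 only, and visible at runtime.
    while b:
        a, b = b, a % b
    return a
```

mypy treats the two signatures the same way; the difference is that type comments leave __annotations__ empty, while inline annotations populate it.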

Surely there are some bugs left in our code. We’re happy to accept bug reports in our tracker and bug fixes as pull requests. Note that all code contributions require filling out the Dropbox Contributor License Agreement (CLA).

We consider the current release a prototype, and we don’t promise that future versions will be backwards compatible. In fact, we hope to find the time to make significant changes. We just didn’t feel we would do anyone a favor by sitting on the code longer than we already have.

