Blog

Learning Google Protocol Buffers with Python - Part 1

This article is part of a series:

The blog post “What I Learned from Quip on How to Build a Product on 8 Different Platforms with Only 13 Engineers” explains how Quip managed to build products for 8 different platforms with a team of only 13 engineers. It’s definitely worth learning from.

A key concept from the post is “build once, use multiple times”: avoid building the same component repeatedly and make each component as reusable as possible. The post also reveals that Quip relies heavily on Google Protocol Buffers. Once a data structure is defined with Protocol Buffers, the code for reading and writing it can be generated automatically for many languages and platforms. The same definition can also serve as a data exchange format between platforms, which cuts duplicated development work and improves efficiency.

With such a handy tool, let’s learn Google Protocol Buffers using Python!

Posted on  Oct 27, 2018  in  Python Programming - Advanced Level  by  Amo Chen  ‐ 5 min read

Python Module - jsonschema Part 3

This post is part of a series on the Python module - jsonschema:

In Python Module - jsonschema Part 2, we covered more complex usage of types such as number, string, array, and object. However, most examples validated a single data type, while in practice JSON data often mixes several types. For example, an array might contain object elements, and an object might include nested objects. Consider the following JSON data:

[
    {
        "user_id": 1,
        "preference": {
            "cooking": true,
            "fishing": false
        }
    },
    {
        "user_id": 1,
        "preference": {
            "cooking": true,
            "fishing": false
        }
    }
]

This section will introduce how to write JSON Schemas that are more practical for real-world use and easier to maintain.
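As a rough illustration of validating nested data like the sample above, here is a minimal sketch. The schema layout, the `required` list, and the names `USER_LIST_SCHEMA`, `SAMPLE`, and `is_valid` are assumptions made for this example and are not taken from the post itself. It relies on the `jsonschema` package being installed for the validation step.

```python
import json

# Hypothetical schema: an array of user objects, each with a nested
# "preference" object. Which keys are required is an assumption.
USER_LIST_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "user_id": {"type": "integer"},
            "preference": {
                "type": "object",
                "properties": {
                    "cooking": {"type": "boolean"},
                    "fishing": {"type": "boolean"},
                },
            },
        },
        "required": ["user_id", "preference"],
    },
}

# Sample data parsed from real JSON (lowercase true/false, no trailing commas).
SAMPLE = json.loads("""
[
    {"user_id": 1, "preference": {"cooking": true, "fishing": false}}
]
""")


def is_valid(instance, schema=USER_LIST_SCHEMA):
    """Return True if instance conforms to schema, False otherwise."""
    # Imported lazily so the schema can be inspected without jsonschema installed.
    import jsonschema

    try:
        jsonschema.validate(instance=instance, schema=schema)
    except jsonschema.ValidationError:
        return False
    return True
```

Calling `is_valid(SAMPLE)` checks the whole list in one pass, including each nested `preference` object, instead of validating every level by hand.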

Posted on  Mar 28, 2018  in  Python Module Recommendations  by  Amo Chen  ‐ 4 min read

Python Module - jsonschema Part 2

This article is a part of a series on the Python module - jsonschema:

In the previous article, Python Module - jsonschema Part 1, we introduced six data types defined by JSON Schema and covered some basic validation techniques.

In this post, we will dive deeper into more complex usages of several types, namely number, string, array, and object.

Posted on  Mar 23, 2018  in  Python Module Recommendations  by  Amo Chen  ‐ 6 min read

Python Module - jsonschema Part 1

This article is part of a tutorial series about the Python module - jsonschema:

JSON is currently one of the mainstream data interchange formats. However, if you want to validate the format of JSON data in your program, you’ll need to spend some effort writing validation code. Fortunately, JSON Schema simplifies the process of validating JSON formats. If you’re using JSON as the data exchange format for an API, you might consider using JSON Schema for validation.

JSON Schema is a vocabulary that allows you to annotate and validate JSON documents.

Posted on  Mar 20, 2018  in  Python Module Recommendations  by  Amo Chen  ‐ 5 min read

R File Manipulation

This post is one of the R language learning notes, focusing on file manipulation.

Posted on  May 30, 2017  in  R  by  Amo Chen  ‐ 1 min read

Connecting Windows to Virtual Machines in VirtualBox

Because VirtualBox provides networking on Windows through virtual network adapters, connecting from Windows to a virtual machine in VirtualBox requires some simple setup. This setup is known as “port forwarding.”

Posted on  Dec 6, 2016  in  VirtualBox  by  Amo Chen  ‐ 2 min read

Recording Test Duration with Python pytest

Recently, I read an article titled “Timing Tests in Python for Fun and Profit,” which is well worth a read. It discusses how to find test cases that need speeding up by recording their execution time (in the process, you might also discover code with poor performance).

However, I often use pytest, so I spent some extra time finding out how to achieve the same in pytest.

Posted on  Nov 20, 2016  in  Python Module Recommendations  by  Amo Chen  ‐ 2 min read

Python Tips - Saving Time and Effort with CSVDictWriter and defaultdict

Python’s csv.DictWriter is a handy class in the standard csv module that lets you write dictionary data directly into a CSV file. Here’s how you can use it:

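import csv

# newline='' lets the csv module control line endings, as the csv docs recommend
with open('my.csv', 'w', newline='') as csvf:
    writer = csv.DictWriter(csvf, fieldnames=[
        'field_1',
        'field_2',
        'field_3'
    ])

    writer.writeheader()  # writes the header row built from fieldnames
    writer.writerow({'field_1': '', 'field_2': 'b', 'field_3': ''})
    writer.writerow({'field_1': '', 'field_2': 'b', 'field_3': ''})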

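While csv.DictWriter is convenient, spelling out every field for every row is a hassle when you have many fields and not all of them have values. (For reference, csv.DictWriter itself fills keys missing from a row with its restval argument, which defaults to an empty string, and raises ValueError only for keys that are not listed in fieldnames.) Do you really need to manually fill in every single field?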

Actually, you can solve this problem by using defaultdict.
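One way defaultdict might be used here is sketched below. This is an illustration under assumptions, not necessarily the approach the full post takes: the helper name `rows_to_csv`, the `FIELDNAMES` constant, and the in-memory `io.StringIO` buffer (used instead of a file so the result is easy to inspect) are all made up for this example.

```python
import csv
import io
from collections import defaultdict

FIELDNAMES = ['field_1', 'field_2', 'field_3']


def rows_to_csv(partial_rows):
    """Write rows that may omit fields; omitted fields become empty strings."""
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    for partial in partial_rows:
        # defaultdict(str) returns '' for any key that was never set,
        # so only the fields that actually have values need to be supplied.
        row = defaultdict(str, partial)
        writer.writerow({name: row[name] for name in FIELDNAMES})
    return buffer.getvalue()


csv_text = rows_to_csv([
    {'field_2': 'b'},
    {'field_1': 'a', 'field_3': 'c'},
])
print(csv_text)
```

Each input row lists only the fields that have values; the lookup `row[name]` supplies the empty string for the rest.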

Posted on  Sep 2, 2016  in  Python Programming - Intermediate Level  by  Amo Chen  ‐ 1 min read