On January 11th, 2020, Taiwan went to the polls to elect its next President and Legislature. In Taiwan’s short history as a democracy, this was only the seventh presidential election, and it came after months of unrest in Hong Kong and amid a rising trend of populism around the world. At The News Lens, we formed a small team of designers, engineers, and editors to build this year’s election tracker.
Before we started planning the technical details, we began by thinking about election coverage as a product: What do readers care about? What are their reading habits / usage patterns? Who is our audience?
For starters, we expected that 80% of our readers would be accessing our content from a mobile device, and that the large majority would be looking for Chinese content, although we would also make a version for English readers interested in Taiwan’s election. We knew that we wanted to display the results as close to real time as we possibly could, and that we wanted to cover the presidential race as well as district, party-list, and indigenous legislators. We also knew that we wanted to create an experience that gave readers content before, during, and after the election, and to build components that could be reused and referenced in the future.
To begin, we launched our election coverage with an aggregation of political polls in Taiwan, giving readers a more holistic view of each presidential candidate’s support and how it changed over time. This not only gave us another way to promote our election coverage, but also let us test our new, Svelte-based system for building interactive articles (more on Svelte below) and work out some of the kinks before election day. In total, building this tracker took us about two months, with a dedicated engineer and designer working nearly full time on the project.
Because users interact with the election tracker mainly through the map, the map had to respond quickly to both user interaction and data updates. To allow users to explore as much of the data as possible, we decided that our map would have three different layers to interact with.
The topmost layer is the whole country, including the outer islands that are part of Taiwan’s jurisdiction, followed by the 22 counties and cities (縣市). Each of the 22 counties can be further divided into towns (鄉鎮區). The smallest unit of locality is the village (里), which we chose not to display because we worried readers would lose track of which layer they were looking at.
All of this map data comes from a Taiwanese government website in the form of a 114MB shapefile. If we converted the shapefile directly to topojson (a format that lets D3 draw maps in the browser, more on this later), we’d still be looking at a file of roughly 90MB: far too big for every user to download to their device, and large enough to hurt the site’s rendering performance.
Moreover, the map needs special handling in two places. First, there are special areas that are part of Taiwan’s jurisdiction but may not be part of an official locality, such as restricted, uninhabited places like 釣魚台. Second, the outer islands need to be relocated closer to the main island; otherwise Kinmen and Lienchiang stretch the map so far that most of it is open sea. At first, I produced a few topojson files by hand using QGIS and the shapefile obtained from the government, but after many failed attempts (incorrect island relocation, missing districts, output files that were still too large, too much detail removed during compression, and so on), I decided to create an API endpoint that would generate the map I wanted on the fly.
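The island relocation step boils down to shifting every coordinate of an outer island by a fixed offset before the map is serialized. Below is a minimal TypeScript sketch of that idea, not the actual endpoint code: the `LocaleShape` type, the `relocate` function, and the offset values are all hypothetical, and real shapes would also carry holes and multiple polygons.

```typescript
// A coordinate pair: [longitude, latitude] in degrees.
type LonLat = [number, number];

// Simplified shape for illustration: outer rings only, no holes.
interface LocaleShape {
  name: string;
  rings: LonLat[][];
}

// Hypothetical offsets in degrees. Real values would be tuned by eye
// until the islands sit close to the main island without overlapping it.
const ISLAND_OFFSETS: Record<string, LonLat> = {
  "金門縣": [1.5, -0.25],
  "連江縣": [0.5, -0.75],
};

// Return a copy of the shape with every point shifted by its offset.
// Shapes without a configured offset are returned unchanged.
function relocate(shape: LocaleShape, offsets: Record<string, LonLat>): LocaleShape {
  if (!Object.prototype.hasOwnProperty.call(offsets, shape.name)) {
    return shape;
  }
  const dx = offsets[shape.name][0];
  const dy = offsets[shape.name][1];
  const movedRings = shape.rings.map((ring) =>
    ring.map((point): LonLat => [point[0] + dx, point[1] + dy])
  );
  return { name: shape.name, rings: movedRings };
}
```

Keeping the offsets in a plain lookup table means a misplaced island can be corrected by changing one number and regenerating the map, rather than redoing the work by hand in QGIS.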
In essence, the API endpoint takes a few parameters:
However, because legislative districts do not line up exactly with any one of the existing locale levels, we had to work out separately which locales make up each legislative district. A whole day of searching did not turn up a shapefile for Taiwan’s legislative districts, so we decided to create the map ourselves based on which counties, towns, or villages belong to which district.
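The first half of that job is bookkeeping: given a table that says which district each locale belongs to, collect the locales of every district. A minimal TypeScript sketch of that grouping follows. The `LocaleAssignment` type, the `groupByDistrict` function, and the codes and district names in the example are hypothetical and only illustrate the shape of the data.

```typescript
// One row of a hypothetical lookup table: which legislative district
// a given county, town, or village belongs to.
interface LocaleAssignment {
  localeCode: string;
  district: string;
}

// Group locale codes by district, preserving the order of the input rows.
function groupByDistrict(rows: LocaleAssignment[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const row of rows) {
    if (!Object.prototype.hasOwnProperty.call(groups, row.district)) {
      groups[row.district] = [];
    }
    groups[row.district].push(row.localeCode);
  }
  return groups;
}

// Made-up codes and district names, for illustration only.
const exampleGroups = groupByDistrict([
  { localeCode: "town-001", district: "District A" },
  { localeCode: "village-017", district: "District B" },
  { localeCode: "town-002", district: "District A" },
]);
```

Once the locales are grouped, the shapes in each group can be merged into a single outline per district; the `merge` function in topojson-client is one way to do that.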
After constructing a reliable way to generate our topojson, the next question was how to display it in the browser. The naive approach is to simply render the topojson to an SVG with D3, which is easy to do. The problem is that SVGs live in the DOM, so every user interaction or data update on a complex SVG has to go through the DOM, which is slow and resource intensive. Add locale labels and mouse interactions and suddenly your SVG implementation slows to a crawl.
The alternative is to use a canvas for display and an invisible SVG for interaction. This lets us avoid the expensive DOM operations of maintaining and updating the color of each locale on the map and instead use the canvas API to paint it, while keeping an SVG around just to track when users interact. In my very unscientific benchmarking on my own machine, I found that an SVG map needed to be about twice as simplified as a canvas map to achieve similar performance. More about this method here.
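To make the canvas half of this concrete, here is a minimal TypeScript sketch of repainting every locale with the canvas path API. It is an illustration under assumptions rather than our production code: `PathContext`, `LocalePolygon`, and `paintLocales` are hypothetical names, the rings are assumed to be already projected to pixel coordinates, and `PathContext` only declares the handful of 2D context members the sketch uses so that it can also run outside a browser.

```typescript
// A point in canvas pixel coordinates.
type PixelPoint = [number, number];

// The few members of a canvas 2D context that this sketch relies on.
// fillStyle is typed loosely so a browser's 2D context also fits.
interface PathContext {
  fillStyle: string | object;
  beginPath(): void;
  moveTo(x: number, y: number): void;
  lineTo(x: number, y: number): void;
  closePath(): void;
  fill(): void;
}

interface LocalePolygon {
  id: string;
  ring: PixelPoint[]; // assumed to be projected to pixels already
}

// Paint every locale in one pass. No DOM nodes are created or updated,
// so a data refresh costs one repaint instead of many attribute changes.
function paintLocales(
  ctx: PathContext,
  polygons: LocalePolygon[],
  colorOf: (id: string) => string
): void {
  for (const polygon of polygons) {
    if (polygon.ring.length === 0) {
      continue;
    }
    ctx.beginPath();
    ctx.moveTo(polygon.ring[0][0], polygon.ring[0][1]);
    for (let i = 1; i < polygon.ring.length; i++) {
      ctx.lineTo(polygon.ring[i][0], polygon.ring[i][1]);
    }
    ctx.closePath();
    ctx.fillStyle = colorOf(polygon.id);
    ctx.fill();
  }
}
```

In the browser, the object passed as `ctx` would be the 2D context of a `<canvas>` element, and a transparent SVG layered on top of the canvas would carry the same shapes purely to receive hover and click events. On each data update only `paintLocales` has to run again.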
The optimizations I could have done on the map are endless. With more time, I would recommend looking at a WebGL-based implementation, potentially MapboxGL, to make map rendering even faster, but a canvas approach was good enough for this use case.
The data for our election tracker came from the Central Election Commission, which provides a 3MB JSON file containing all of the data whenever it is updated. Because this file is so large and can take up to 5 seconds to download from the Central Election Commission, we decided to fetch it only at a regular cadence and to process and store it in a database that we control and can scale, so that we could cope with the high traffic to our site.
To do this, we used a cron job that fetched the latest data from the Central Election Commission every minute and separated out the presidential and legislative results before storing them in Google Datastore.
Then, in order to indicate when a candidate had won, we connected a Google Spreadsheet that our editors could fill in to manually mark a race as won. The spreadsheet was wired directly into our system, which used that data to display the winner on the frontend. As it turns out, this became a problem on the night of the election, which I expand on in the “lessons learned” section.
In terms of electoral rules, Taiwan elects its president by a simple plurality, where the candidate with the most votes wins, which is easy to display on a map. However, legislative races follow a fairly complex system:
This logic poses a few challenges. First of all, the topmost level we can display for district legislators is the district itself, because we can’t calculate the color for a county that has multiple districts with different races.
Second, because each of the indigenous races (plains and highland) elects its top three candidates, simply coloring a district with the party of the first-place candidate does not reflect the true breakdown of the vote. Thus, we decided to take the top three candidates, add up their votes by political party, and use the color of the party with the most votes.
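The coloring rule for the indigenous races can be sketched in a few lines of TypeScript. This is an illustration of the rule as described above, not our production code: the `Candidate` type, the `leadingPartyAmongWinners` function, and the parties and vote counts in the example are all made up.

```typescript
interface Candidate {
  candidateName: string;
  party: string;
  votes: number;
}

// Take the top `seats` candidates by votes, add up their votes per party,
// and return the party with the largest total (null if there are no candidates).
function leadingPartyAmongWinners(candidates: Candidate[], seats: number): string | null {
  const winners = candidates
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.votes - a.votes)
    .slice(0, seats);

  const votesByParty: Record<string, number> = {};
  for (const winner of winners) {
    votesByParty[winner.party] = (votesByParty[winner.party] || 0) + winner.votes;
  }

  let leadingParty: string | null = null;
  let leadingVotes = -1;
  for (const party of Object.keys(votesByParty)) {
    if (votesByParty[party] > leadingVotes) {
      leadingParty = party;
      leadingVotes = votesByParty[party];
    }
  }
  return leadingParty;
}

// Made-up numbers: Party X has the first-place candidate, but Party Y's
// two winners together hold more votes, so the area takes Party Y's color.
const exampleLeader = leadingPartyAmongWinners(
  [
    { candidateName: "A", party: "Party X", votes: 100 },
    { candidateName: "B", party: "Party Y", votes: 70 },
    { candidateName: "C", party: "Party Y", votes: 60 },
    { candidateName: "D", party: "Party Z", votes: 50 },
  ],
  3
);
```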
Third, because of the gender rule for party votes, the legislators a party elects through this method do not necessarily follow the order in which they appear on the party list. As a result, we had to build an algorithm that skips over the next male legislator on the list if too many men have already been seated.
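A minimal TypeScript sketch of such a skipping algorithm is shown below. It assumes the rule is that women must make up at least half of the seats a party wins through the party vote, so at most `floor(seats / 2)` seats can go to men; that reading of the rule, along with the `ListCandidate` type and the `allocatePartySeats` function, is an assumption of this sketch and not a copy of our production logic.

```typescript
interface ListCandidate {
  candidateName: string;
  gender: "F" | "M";
}

// Walk the party list in order and fill `seats` seats, skipping a male
// candidate whenever seating him would push men above half of the seats.
// Assumption: at least half of a party's list seats must go to women.
function allocatePartySeats(partyList: ListCandidate[], seats: number): ListCandidate[] {
  const maxMen = Math.floor(seats / 2);
  const elected: ListCandidate[] = [];
  let menSeated = 0;

  for (const candidate of partyList) {
    if (elected.length >= seats) {
      break;
    }
    if (candidate.gender === "M") {
      if (menSeated >= maxMen) {
        continue; // too many men already: skip to the next name on the list
      }
      menSeated += 1;
    }
    elected.push(candidate);
  }
  return elected;
}
```

If the list runs out of women before the seats are filled, this sketch simply returns fewer candidates than seats; how that case should be displayed is left open here.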
Another large element of our election coverage was the graphics that we posted to our website and to social media such as Facebook and Instagram. We wanted to generate social-media-optimized graphics as fast as possible, and if we had waited until after the election for designers to plug the results into each graphic by hand, they would have been rushed and exhausted. To avoid this, we built an internal website that dynamically generates the pre-designed graphics as SVGs with the latest data from the Central Election Commission.
To achieve this, designers first came up with the design of the graphics in an Adobe Illustrator document. We then exported each graphic into a separate SVG. Because SVG is an XML-based format, we can port these files into our web application and make the content inside each SVG dynamic.
The great thing about using SVGs in this way is that the approach is language and tool agnostic, which means you can use whatever templating language or framework you already have to generate these graphics. In our case, we fitted it into an internal website where the social media team could directly download the generated graphic as an SVG or a PNG to post on platforms like Facebook and Instagram.
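Because an exported SVG is just text, the simplest possible version of this idea is string templating. The TypeScript sketch below assumes the exported SVG has been edited to contain `{{placeholder}}` markers; that marker syntax, the `fillSvgTemplate` and `escapeXml` helpers, and the sample template are all hypothetical and stand in for whatever templating system you already use.

```typescript
// Escape characters that have special meaning in XML so that a value
// such as "A & B" cannot break the SVG document.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace every {{key}} marker with the matching value.
// Markers without a value are left in place so they are easy to spot.
function fillSvgTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (marker: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      return marker;
    }
    return escapeXml(values[key]);
  });
}

// A tiny stand-in for a designer's exported graphic.
const sampleTemplate =
  '<svg xmlns="http://www.w3.org/2000/svg" width="400" height="100">' +
  '<text x="20" y="40">{{candidate}}</text>' +
  '<text x="20" y="80">{{votes}}</text>' +
  "</svg>";

// Placeholder values, not real results.
const sampleGraphic = fillSvgTemplate(sampleTemplate, {
  candidate: "Candidate A",
  votes: "1,234",
});
```

The resulting string can be served as an SVG file as-is, and converting it to a PNG is a separate step that this sketch does not cover.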
Another critical piece of election preparedness was making sure that we knew what was happening with our web application at all times, so that we could react quickly when something went wrong. To monitor the health of the application and make sure that everything scaled properly, we ran Prometheus and Grafana on our Kubernetes cluster, with a Grafana dashboard that kept track of server errors, response latency, upstream API connections, and cache hit ratios, as well as the standard CPU, memory, and database metrics.
Having the web application report metrics directly to Prometheus made it much easier to get the fine-grained, election-specific metrics that we cared about. Because we had taken the time to integrate key metrics into Prometheus, when errors began to show up during the election (the Google Spreadsheet problem described below), we knew immediately that the cause was on Google’s side.
Although React is possibly the most popular front-end framework these days, one of the questions I find interesting to ask during interviews is “Under what circumstances is React not a good framework to use? When would you recommend not using React?” This question stumps many less-experienced engineers and people without a good grasp of what React really does in the browser under the hood. In fact, I believe there are many situations in which React is not a good choice, and building interactives in a news environment is one of them.
News organizations have a very different development and deployment cycle than tech companies do, with tighter deadlines, faster feedback loops, and a need for more bespoke flexibility. Thus, I decided to build the live results page with Svelte, which gave us the following main advantages, among many, over React:
If you’re interested in learning more about Svelte, there are many tutorials to help you get started.
Election day brought the highest traffic The News Lens had seen, and the biggest problem we saw that night was that the site was down for about an hour as a result. Specifically, it wasn’t the servers on our end that could not take the traffic; it was our use of Google Sheets. When the volume of requests became too large, Google locked our Google Spreadsheet. After that, our API started responding to downstream users with large numbers of 500 errors, making the entire site inoperable.
The solution was first to disconnect the spreadsheet from our services, which brought both our service and the Google Spreadsheet back online. Then we implemented a quick, very simple function that prevents the service from requesting data from the spreadsheet so often, and the system returned to normal.
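A minimal TypeScript sketch of that kind of throttling function is shown below. It is not the code we deployed: the `throttled` helper, its parameters, and the choice to serve the last known value when the upstream call fails are all assumptions of this sketch, and the fetcher is synchronous only to keep the example short.

```typescript
// Wrap a fetcher so it is called at most once per `minIntervalMs`.
// Between calls, the last fetched value is served from memory.
// Assumption: if the upstream call fails and an older value exists, serving
// the older value is better than returning an error to every reader.
function throttled<T>(
  fetcher: () => T,
  minIntervalMs: number,
  now: () => number
): () => T {
  let cached: { value: T; fetchedAt: number } | null = null;

  return (): T => {
    const currentTime = now();
    const previous = cached;
    if (previous !== null && currentTime - previous.fetchedAt < minIntervalMs) {
      return previous.value; // still fresh enough: do not call upstream
    }
    try {
      const fresh = fetcher();
      cached = { value: fresh, fetchedAt: currentTime };
      return fresh;
    } catch (err) {
      if (previous === null) {
        throw err; // nothing to fall back on
      }
      // Keep the old value and wait a full interval before retrying.
      cached = { value: previous.value, fetchedAt: currentTime };
      return previous.value;
    }
  };
}
```

With a wrapper like this, the number of requests sent to the spreadsheet depends only on the interval and no longer on how many readers are hitting the site. Passing the clock in as `now` (for example `() => Date.now()`) also makes the behavior easy to test without waiting.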
The lesson here is that, unless you fully load test your entire system before the election, you never know what problems will be caused when traffic increases by a few orders of magnitude. Often, third-party services such as Google Docs will have their own unwritten throughput and bandwidth limits.
No matter how much planning you do in preparation for the big day, incidents are always likely to happen and things will never go exactly according to plan. That said, once you’ve made all the preparations you possibly can, the next important thing to plan for is how to respond when things do not go as planned. In our case, we had divided up responsibilities, which gave us a basic idea of who did what when things went awry, but it would have been more advantageous to rehearse a few scenarios.
Namely, when the site goes down, it is important to define who will fix the problem and who will report the status to the rest of the team. It is hard for one person to do both: the person trying to fix the problem gets bombarded with inquiries about the current status and cannot concentrate on the fix.
In conclusion, I hope the process we used and the lessons we learned can be built upon by the community, allowing future elections and similar events to be faster, more efficient, and better experiences for our readers and our organizations. I’m sure that our processes will continue to evolve, and I hope we can continue to learn better workflows and collaboration processes.