<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>City Fish</title><link href="https://casyfill.github.io/" rel="alternate"/><link href="https://casyfill.github.io/feeds/all.atom.xml" rel="self"/><id>https://casyfill.github.io/</id><updated>2022-09-26T18:04:00-04:00</updated><entry><title>Dummy blogpost</title><link href="https://casyfill.github.io/dummy-blogpost.html" rel="alternate"/><published>2022-09-26T18:04:00-04:00</published><updated>2022-09-26T18:04:00-04:00</updated><author><name>Philipp Kats</name></author><id>tag:casyfill.github.io,2022-09-26:/dummy-blogpost.html</id><content type="html">&lt;p&gt;Ping ping! This is an initial dummy blogpost. Let's see how this works on deployment.&lt;/p&gt;</content><category term="blog"/></entry><entry><title>Dataframe Schema</title><link href="https://casyfill.github.io/projects/dataframe_schema.html" rel="alternate"/><published>2022-09-25T00:20:00-04:00</published><updated>2022-09-25T00:20:00-04:00</updated><author><name>Philipp Kats</name></author><id>tag:casyfill.github.io,2022-09-25:/projects/dataframe_schema.html</id><summary type="html">&lt;p&gt;&lt;strong&gt;DataFrame Schema&lt;/strong&gt; is a small and slim package that does only one job: it checks whether a given &lt;em&gt;DataFrame&lt;/em&gt; fits a list of defined expectations. It is heavily inspired by &lt;code&gt;jsonschema&lt;/code&gt;, has only one dependency (Pydantic), and tries to be as simple and small as possible.&lt;/p&gt;
&lt;p&gt;In our experience, having a very simple, explicit, and easy-to-generate definition of what we expect to get is crucial in our day-to-day work.&lt;/p&gt;
&lt;h2&gt;Core Ideas&lt;/h2&gt;
&lt;p&gt;Compared to the alternatives, this package has a few cornerstone ideas:
1. It is meant to be simple and easy to use, taking minimal time to use …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;DataFrame Schema&lt;/strong&gt; is a small and slim package that does only one job: it checks whether a given &lt;em&gt;DataFrame&lt;/em&gt; fits a list of defined expectations. It is heavily inspired by &lt;code&gt;jsonschema&lt;/code&gt;, has only one dependency (Pydantic), and tries to be as simple and small as possible.&lt;/p&gt;
&lt;p&gt;In our experience, having a very simple, explicit, and easy-to-generate definition of what we expect to get is crucial in our day-to-day work.&lt;/p&gt;
&lt;h2&gt;Core Ideas&lt;/h2&gt;
&lt;p&gt;Compared to the alternatives, this package has a few cornerstone ideas:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It is meant to be simple and easy to use, taking minimal time to set up.&lt;/li&gt;
&lt;li&gt;It puts the &lt;em&gt;DataFrame&lt;/em&gt; front and center and goes from there (read more below).&lt;/li&gt;
&lt;li&gt;It aims to be interoperable and will try to read from and write to other formats (e.g. tableschema, great_expectations, etc.).&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;DataFrame in the middle&lt;/h2&gt;
&lt;p&gt;We explicitly define "DataFrame" as a package-agnostic abstraction, as it (theoretically) could be a dataframe from &lt;em&gt;pandas&lt;/em&gt;, &lt;em&gt;geopandas&lt;/em&gt;, &lt;em&gt;dask&lt;/em&gt;, &lt;em&gt;ray&lt;/em&gt;, &lt;em&gt;Spark&lt;/em&gt;, or anything else. In each case, we use the corresponding &lt;em&gt;flavor&lt;/em&gt;: the decisions and assumptions made by the corresponding package. For example, if we check data loaded from a database into pandas, we only care about the datatypes defined by &lt;em&gt;pandas&lt;/em&gt;, and let it infer datatypes as it wishes.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Right now we only support pandas, but it should be easy to add other flavors and it is on our roadmap.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/StreetEasy/dfs"&gt;repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/dataframe-schema/"&gt;pypi page&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="projects"/></entry><entry><title>MosPlus (PlutoPlus copycat)</title><link href="https://casyfill.github.io/projects/mosplus.html" rel="alternate"/><published>2020-09-27T20:58:00-04:00</published><updated>2020-09-27T20:58:00-04:00</updated><author><name>Philipp Kats</name></author><id>tag:casyfill.github.io,2020-09-27:/projects/mosplus.html</id><summary type="html">&lt;p&gt;&lt;strong&gt;MosPlus&lt;/strong&gt; is a forked version of Chris Whong's awesome &lt;a href="http://chriswhong.github.io/plutoplus/"&gt;PlutoPlus&lt;/a&gt; project, built using open footprints for Moscow (Russia).&lt;/p&gt;
&lt;p&gt;Building Footprint is a great Moscow Open Data Resource that contains a wealth of information about the city's building footprints, including address, cadaster zoning, status, registration data, and a few more attributes. It covers the city's 145,000+ buildings, with 19 attributes for each one. That is (so far) a unique open data collection for Russia!&lt;/p&gt;
&lt;p&gt;Moscow Building Footprint is quite large, available &lt;strong&gt;only&lt;/strong&gt; through an &lt;a href="http://api.data.mos.ru/"&gt;API&lt;/a&gt;, and hard to use. That is why I forked and edited blueprints for PlutoPlus …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;MosPlus&lt;/strong&gt; is a forked version of Chris Whong's awesome &lt;a href="http://chriswhong.github.io/plutoplus/"&gt;PlutoPlus&lt;/a&gt; project, built using open footprints for Moscow (Russia).&lt;/p&gt;
&lt;p&gt;Building Footprint is a great Moscow Open Data Resource that contains a wealth of information about the city's building footprints, including address, cadaster zoning, status, registration data, and a few more attributes. It covers the city's 145,000+ buildings, with 19 attributes for each one. That is (so far) a unique open data collection for Russia!&lt;/p&gt;
&lt;p&gt;Moscow Building Footprint is quite large, available &lt;strong&gt;only&lt;/strong&gt; through an &lt;a href="http://api.data.mos.ru/"&gt;API&lt;/a&gt;, and hard to use. That is why I forked and edited blueprints for PlutoPlus, a great tool from Chris Whong (originally built for the &lt;strong&gt;MapPluto&lt;/strong&gt; dataset), to help people get access to smaller chunks of the data quickly and easily for whatever they are working on.
All data comes from the &lt;em&gt;25.03.2016&lt;/em&gt; version and can be exported as &lt;em&gt;GeoJSON&lt;/em&gt;, a &lt;em&gt;zipped shapefile&lt;/em&gt;, or &lt;em&gt;CSV&lt;/em&gt;, or can be &lt;em&gt;imported directly into your CartoDB account&lt;/em&gt;. Geometries are exported in &lt;em&gt;WGS84&lt;/em&gt; (latitude and longitude). For neighborhood (rayon) borders, I used &lt;a href="http://gis-lab.info/qa/moscow-atd.html"&gt;this dataset&lt;/a&gt; from Gis-Lab.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: there is a separate, not-exactly-open-source yet invaluable dataset in the wild: &lt;strong&gt;data for ALL buildings in Russia&lt;/strong&gt;, including address, year, type, and number of units and floors, so it is pretty similar to &lt;em&gt;Pluto&lt;/em&gt;. It originated on the &lt;em&gt;reforma-zkh&lt;/em&gt; website and has since been removed from it, but can still be found on the internet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img alt="screenshot" src="../static/mosplus.png"&gt;&lt;/p&gt;
&lt;h2&gt;Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Casyfill/mosplus"&gt;repo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://chriswhong.github.io/plutoplus/#"&gt;original PlutoPlus&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Data&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://data.mos.ru/opendata/1927/description?versionNumber=1&amp;amp;releaseNumber=1"&gt;Brief Description (rus)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://data.mos.ru/opendata/1927/passport?versionNumber=1&amp;amp;releaseNumber=1"&gt;Dataset Passport&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://data.mos.ru/opendata/1927/data/table?versionNumber=1&amp;amp;releaseNumber=1"&gt;Dataset&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="projects"/></entry><entry><title>Projects I worked on at Ria Novosti (2013-2015)</title><link href="https://casyfill.github.io/projects/ria_novosti_projects.html" rel="alternate"/><published>2020-05-02T00:20:00-04:00</published><updated>2020-05-02T00:20:00-04:00</updated><author><name>Philipp Kats</name></author><id>tag:casyfill.github.io,2020-05-02:/projects/ria_novosti_projects.html</id><summary type="html">&lt;p&gt;I worked on Ria Novosti's Infographics team as a data journalist/editor from 2012 to 2015. That was a pivotal moment for me, as I "officially" left my architectural career for a more data-related one. It was a great experience for many reasons, including quick-turnaround project management, collaboration with developers and designers, quite creative data analytics, and the time and opportunity to read a lot, learn a lot, and meet many interesting people.&lt;/p&gt;
&lt;p&gt;Despite being part of one of the two official Russian news agencies, our team was quite liberal, both as a collective and in terms of the work we did. This wasn't …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I worked on Ria Novosti's Infographics team as a data journalist/editor from 2012 to 2015. That was a pivotal moment for me, as I "officially" left my architectural career for a more data-related one. It was a great experience for many reasons, including quick-turnaround project management, collaboration with developers and designers, quite creative data analytics, and the time and opportunity to read a lot, learn a lot, and meet many interesting people.&lt;/p&gt;
&lt;p&gt;Despite being part of one of the two official Russian news agencies, our team was quite liberal, both as a collective and in terms of the work we did. This wasn't exactly a coincidence: on one side, it is really hard to read and write news and not form your own opinions. But it was also a deliberate policy of the agency, as we played along with the "liberal" wing of the government (and yes, also expanded our audience). Our work was, in part, a reason the whole agency was dismantled in 2015; our team was "sold" to Rambler Media Group. Despite it all, I think we accomplished a lot, and I am very proud of our work.&lt;/p&gt;
&lt;h2&gt;Where did they go? How Russian Deputies changed their political alignment&lt;/h2&gt;
&lt;p&gt;&lt;img alt="20y-deputies" src="../static/ria/vis-gosduma-20.png"&gt;&lt;/p&gt;
&lt;p&gt;A project dedicated to the 20th anniversary of the Russian Duma (Parliament), tracing the political alignment of deputies and of every political party ever represented there.&lt;/p&gt;
&lt;h5&gt;Team:&lt;/h5&gt;
&lt;p&gt;code: &lt;strong&gt;Michail Dunayev&lt;/strong&gt;
design: &lt;strong&gt;Valeriy Borisov&lt;/strong&gt; 
editor, analyst: &lt;strong&gt;Philipp Kats&lt;/strong&gt;
director: &lt;strong&gt;Maya Stravinskaya&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Clustering Russian Deputies&lt;/h2&gt;
&lt;p&gt;&lt;img alt="clustering" src="../static/ria/vis-gosduma-cluster.png"&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Clustering Russian Deputies&lt;/strong&gt; is a project done in 2013 at &lt;em&gt;Ria Novosti Studio of Infographics&lt;/em&gt;. The project explores how the actual votes of Russian deputies reveal underlying political structures and factions, and how that behavior changes with the topic of the law. It also highlights how specific deputies voted on key issues, such as the prohibition of international adoption of Russian children.&lt;/p&gt;
&lt;p&gt;Unfortunately, the project itself has been removed from the site, but here is a review (with an archived screenshot):
&lt;a href="https://ria.ru/20130708/948263330.html"&gt;description&lt;/a&gt;&lt;/p&gt;
&lt;h5&gt;Team:&lt;/h5&gt;
&lt;p&gt;code: &lt;strong&gt;Evgeny Panov&lt;/strong&gt;
design: &lt;strong&gt;Alexey Novichkov, Valeriy Borisov&lt;/strong&gt;
editor, analyst: &lt;strong&gt;Philipp Kats&lt;/strong&gt;
director: &lt;strong&gt;Maya Stravinskaya&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Deputies Tax Declarations "Calculator"&lt;/h2&gt;
&lt;p&gt;&lt;img alt="clustering" src="../static/ria/deputy_tax_declarations.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;A visual representation of deputies' tax declarations and their distribution: who owns what, and how much of it.
Data was collected, parsed, and cleaned. Car prices were estimated using external databases.&lt;/p&gt;
&lt;h2&gt;Russian Government Budget "Calculator"&lt;/h2&gt;
&lt;p&gt;&lt;img alt="clustering" src="../static/ria/budget_calculator.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;A visual representation of the state budget, plus a "minigame" / survey on your personal preferences. For each custom budget, a country with a similar budget strategy is shown.&lt;/p&gt;
&lt;h2&gt;Pension Reform Calculator (How they got you)&lt;/h2&gt;
&lt;p&gt;&lt;img alt="clustering" src="../static/ria/pension_calculator.jpeg"&gt;&lt;/p&gt;
&lt;p&gt;A visual representation of the new "formula" for the official pension (social security payments). It shows how the formula yields the same amount "on paper" while allowing the government to pay less over time.&lt;/p&gt;
&lt;h2&gt;Singapore Urban Forum&lt;/h2&gt;
&lt;!-- ![muf](../static/ria/muf.png) --&gt;

&lt;ul&gt;
&lt;li&gt;Information is Beautiful Awards
(&lt;a href="https://www.informationisbeautifulawards.com/showcase/563-global-trends-challenging-cities"&gt;link&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Four posters telling the story of Moscow's urban development.&lt;/p&gt;
&lt;p&gt;Design: Nadezhda Andrianova, Maria Miahaylova
Data, Content: Philipp Kats&lt;/p&gt;</content><category term="projects"/></entry><entry><title>Other Projects</title><link href="https://casyfill.github.io/other_projects.html" rel="alternate"/><published>2020-01-01T00:00:00-05:00</published><updated>2020-01-01T00:00:00-05:00</updated><author><name>Philipp Kats</name></author><id>tag:casyfill.github.io,2020-01-01:/other_projects.html</id><content type="html">&lt;p&gt;This is an index of a few of my other projects. At some point I'll move each of them into a separate page and provide more details.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Casyfill/pyCombo"&gt;PyCombo - Python Wrapper around Combo Network partition Algorithm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/altair-viz/pdvega"&gt;pdvega, vega driver for pandas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://streeteasy.com/blog/data-dashboard/"&gt;StreetEasy Data Dashboard&lt;/a&gt; (Data Visualisation by Paul Buffa)&lt;/li&gt;
&lt;/ul&gt;</content><category term="projects"/></entry></feed>