2024-07-04
I was trying to do a simple transfer with rsync (via ssh) and ran into an issue. I remember seeing “error 2” and “Is your shell clean?”
How did my shell get dirty?
After a brief search I found that any kind of MOTD or printed login information is bad for rsync. In other words, rsync expects the remote shell to print nothing of its own when it connects, and MOTD and friends give it something.
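A quick way to confirm the diagnosis is to run a do-nothing command over ssh and see whether anything comes back. The rsync man page suggests much the same thing (`ssh remotehost /bin/true > out.dat` should leave an empty file). Here is a minimal Python sketch of that check; the helper name and the host `pi@raspberrypi.local` are placeholders of mine, not something from the original setup.

```python
import subprocess
import sys


def stray_output(command):
    """Run a command that should print nothing; return the bytes it printed anyway."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.stdout


# Local sanity check: a Python process that does nothing prints nothing.
assert stray_output([sys.executable, "-c", "pass"]) == b""

# Against the real machine (placeholder host name), an empty result means the
# shell is clean; anything else is what rsync is choking on:
#   noise = stray_output(["ssh", "pi@raspberrypi.local", "true"])
#   print(repr(noise))
```

If the ssh version returns any bytes at all, that output is the "dirt".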
To remove everything (this was on a fresh install of Raspbian Bookworm):
# the nuclear option
# delete MOTD
sudo rm /etc/motd
# delete the dynamic MOTD script for uname
sudo rm /etc/update-motd.d/10-uname
# silence anything else with this hidden file in $HOME
touch ~/.hushlogin
2024-01-13
When I was working on a presentation about using python for data analysis, I stumbled upon Marimo but it seemed to be a fairly new project. There have been a number of developments, so I think it’s time to check it out.
I just wanted to explore a bit of what it does and the features it has (not really use it), so I made a very simple notebook.
To get started I wanted to use poetry:
# first make and go into new directory
poetry new .
poetry add marimo
# after installation is finished
poetry shell
Now it’s ready to start using:
Initial Thoughts
- I like that this has an easy interface (i.e. I don’t have to deal with a full UI before I get into a notebook, like with Jupyter). Just type marimo edit <file> and we’re good.
- I like that there are interactive elements.
- The hidden content shows up under the cells (bug)
What does the python file look like?
I’ll make a hello_world.py with a markdown block and code printing “hello world”.
import marimo

__generated_with = "0.1.76"
app = marimo.App()


@app.cell
def __():
    import marimo as mo
    return mo,


@app.cell
def __(mo):
    mo.md("Hello World")
    return


@app.cell
def __():
    print("Hello World")
    return


@app.cell
def __():
    return


if __name__ == "__main__":
    app.run()
That’s very simple. The nice thing about this is that it works great for git. Even just being able to edit this in a sensible manner outside of the browser interface is possible, although it doesn’t update live that way (but I have seen that this is in the roadmap).
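Because the notebook is an ordinary Python file, other tools can read it too. As a rough illustration (my own sketch, not anything marimo provides), the standard `ast` module can list the cells without importing marimo at all. It needs Python 3.9+ for `ast.unparse`, and `list_cells` is a made-up helper name.

```python
import ast

# Copy of the hello_world.py notebook shown above.
NOTEBOOK_SOURCE = '''
import marimo

__generated_with = "0.1.76"
app = marimo.App()


@app.cell
def __():
    import marimo as mo
    return mo,


@app.cell
def __(mo):
    mo.md("Hello World")
    return


@app.cell
def __():
    print("Hello World")
    return


@app.cell
def __():
    return


if __name__ == "__main__":
    app.run()
'''


def list_cells(source):
    """Return (argument names, body source) for each function decorated with @app.cell."""
    cells = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        decorators = [ast.unparse(d) for d in node.decorator_list]
        if "app.cell" in decorators:
            args = [a.arg for a in node.args.args]
            body = "\n".join(ast.unparse(stmt) for stmt in node.body)
            cells.append((args, body))
    return cells


for args, body in list_cells(NOTEBOOK_SOURCE):
    print(args, "->", body.splitlines()[0])
```

The function arguments are the names a cell reads from other cells, which is why the second cell takes `mo`.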
I can now run this as an app as well with marimo run <file>.
If I edit the notebook, the print() statement works, but when running the app it doesn’t show up. The only thing that shows up in the app is the rendered “Hello World” from the markdown block.
Note: the print() statement printed to the console when I ran it as an app.
What does the HTML output look like?
The UI has Export to HTML as a button, and also as a command in the command palette.
Here’s the output from the one above:
<!DOCTYPE html>
<html><head><meta charset="utf-8">
<link rel="icon" href="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/favicon.ico" crossorigin="anonymous">
<!-- Preload is necessary because we show these images when we disconnect from the server,
but at that point we cannot load these images from the server -->
<link rel="preload" href="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/assets/gradient-Mh0FAv0A.png" as="image" crossorigin="anonymous">
<link rel="preload" href="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/assets/noise-OtAaEwPD.png" as="image" crossorigin="anonymous">
<!-- Preload the fonts -->
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#000000">
<meta name="description" content="a marimo app">
<link rel="apple-touch-icon" href="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/apple-touch-icon.png" crossorigin="anonymous">
<link rel="manifest" href="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/manifest.json" crossorigin="anonymous">
<script data-marimo="true">
function __resizeIframe(obj) {
// Resize the iframe to the height of the content
obj.style.height =
obj.contentWindow.document.documentElement.scrollHeight + "px";
// Resize the iframe when the content changes
const resizeObserver = new ResizeObserver((entries) => {
obj.style.height =
obj.contentWindow.document.documentElement.scrollHeight + "px";
});
resizeObserver.observe(obj.contentWindow.document.body);
}
</script>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin="">
<link href="https://fonts.googleapis.com/css2?family=Fira+Mono:wght@400;500;700&family=Lora&family=PT+Sans:wght@400;700&display=swap" rel="stylesheet"></head>
<body>
<div id="root"></div>
<marimo-mode data-mode="read" hidden=""></marimo-mode>
<marimo-filename hidden="">hello_world.py</marimo-filename>
<marimo-version data-version="0.1.76" hidden=""></marimo-version>
<marimo-user-config data-config="{"completion": {"activate_on_typing": true, "copilot": false}, "display": {"theme": "light", "code_editor_font_size": 14}, "formatting": {"line_length": 79}, "keymap": {"preset": "default"}, "runtime": {"auto_instantiate": true}, "save": {"autosave": "after_delay", "autosave_delay": 1000, "format_on_save": false}, "server": {"browser": "default"}}" hidden=""></marimo-user-config>
<marimo-app-config data-config="{"width": "normal", "layout_file": null}" hidden=""></marimo-app-config>
<script data-marimo="true">
window.__MARIMO_STATIC__ = {};
window.__MARIMO_STATIC__.version = "0.1.76";
window.__MARIMO_STATIC__.notebookState = {"cellIds":["0","1","2","3"],"cellData":{"0":"JTdCJTIyaWQlMjIlM0ElMjIwJTIyJTJDJTIyY29uZmlnJTIyJTNBJTdCJTIyZGlzYWJsZWQlMjIlM0FmYWxzZSUyQyUyMmhpZGVfY29kZSUyMiUzQWZhbHNlJTdEJTJDJTIybmFtZSUyMiUzQSUyMl9fJTIyJTJDJTIyY29kZSUyMiUzQSUyMmltcG9ydCUyMG1hcmltbyUyMGFzJTIwbW8lMjIlMkMlMjJlZGl0ZWQlMjIlM0FmYWxzZSUyQyUyMmxhc3RDb2RlUnVuJTIyJTNBbnVsbCUyQyUyMnNlcmlhbGl6ZWRFZGl0b3JTdGF0ZSUyMiUzQW51bGwlN0Q=","1":"JTdCJTIyaWQlMjIlM0ElMjIxJTIyJTJDJTIyY29uZmlnJTIyJTNBJTdCJTIyZGlzYWJsZWQlMjIlM0FmYWxzZSUyQyUyMmhpZGVfY29kZSUyMiUzQWZhbHNlJTdEJTJDJTIybmFtZSUyMiUzQSUyMl9fJTIyJTJDJTIyY29kZSUyMiUzQSUyMm1vLm1kKCU1QyUyMkhlbGxvJTIwV29ybGQlNUMlMjIpJTIyJTJDJTIyZWRpdGVkJTIyJTNBZmFsc2UlMkMlMjJsYXN0Q29kZVJ1biUyMiUzQW51bGwlMkMlMjJzZXJpYWxpemVkRWRpdG9yU3RhdGUlMjIlM0FudWxsJTdE","2":"JTdCJTIyaWQlMjIlM0ElMjIyJTIyJTJDJTIyY29uZmlnJTIyJTNBJTdCJTIyZGlzYWJsZWQlMjIlM0FmYWxzZSUyQyUyMmhpZGVfY29kZSUyMiUzQWZhbHNlJTdEJTJDJTIybmFtZSUyMiUzQSUyMl9fJTIyJTJDJTIyY29kZSUyMiUzQSUyMnByaW50KCU1QyUyMkhlbGxvJTIwV29ybGQlNUMlMjIpJTIyJTJDJTIyZWRpdGVkJTIyJTNBZmFsc2UlMkMlMjJsYXN0Q29kZVJ1biUyMiUzQW51bGwlMkMlMjJzZXJpYWxpemVkRWRpdG9yU3RhdGUlMjIlM0FudWxsJTdE","3":"JTdCJTIyaWQlMjIlM0ElMjIzJTIyJTJDJTIyY29uZmlnJTIyJTNBJTdCJTIyZGlzYWJsZWQlMjIlM0FmYWxzZSUyQyUyMmhpZGVfY29kZSUyMiUzQWZhbHNlJTdEJTJDJTIybmFtZSUyMiUzQSUyMl9fJTIyJTJDJTIyY29kZSUyMiUzQSUyMiUyMiUyQyUyMmVkaXRlZCUyMiUzQWZhbHNlJTJDJTIybGFzdENvZGVSdW4lMjIlM0FudWxsJTJDJTIyc2VyaWFsaXplZEVkaXRvclN0YXRlJTIyJTNBbnVsbCU3RA=="},"cellRuntime":{"0":"JTdCJTIyb3V0bGluZSUyMiUzQW51bGwlMkMlMjJvdXRwdXQlMjIlM0ElN0IlMjJjaGFubmVsJTIyJTNBJTIyb3V0cHV0JTIyJTJDJTIybWltZXR5cGUlMjIlM0ElMjJ0ZXh0JTJGcGxhaW4lMjIlMkMlMjJkYXRhJTIyJTNBJTIyJTIyJTJDJTIydGltZXN0YW1wJTIyJTNBMTcwNTEyODg1OC4zODEyMzQlN0QlMkMlMjJjb25zb2xlT3V0cHV0cyUyMiUzQSU1QiU1RCUyQyUyMnN0YXR1cyUyMiUzQSUyMmlkbGUlMjIlMkMlMjJpbnRlcnJ1cHRlZCUyMiUzQWZhbHNlJTJDJTIyZXJyb3JlZCUyMiUzQWZhbHNlJTJDJTIyc3RvcHBlZCUyMiUzQWZhbHNlJTJDJTIycnVuRWxhcHNlZFRpbWVNcyUyMiUzQTAuMzY0MDY1MTcwMjg4MDg1OTQlMkMlMjJydW5TdGFydFRpbWVzdGFtcCUyMiUzQW51bGwlN0Q="
,"1":"JTdCJTIyb3V0bGluZSUyMiUzQSU3QiUyMml0ZW1zJTIyJTNBJTVCJTVEJTdEJTJDJTIyb3V0cHV0JTIyJTNBJTdCJTIyY2hhbm5lbCUyMiUzQSUyMm91dHB1dCUyMiUyQyUyMm1pbWV0eXBlJTIyJTNBJTIydGV4dCUyRmh0bWwlMjIlMkMlMjJkYXRhJTIyJTNBJTIyJTNDc3BhbiUyMGNsYXNzJTNEJ21hcmtkb3duJyUzRSUzQ3NwYW4lMjBjbGFzcyUzRCdwYXJhZ3JhcGgnJTNFSGVsbG8lMjBXb3JsZCUzQyUyRnNwYW4lM0UlM0MlMkZzcGFuJTNFJTIyJTJDJTIydGltZXN0YW1wJTIyJTNBMTcwNTEyODg1OC40MDY0NjIlN0QlMkMlMjJjb25zb2xlT3V0cHV0cyUyMiUzQSU1QiU1RCUyQyUyMnN0YXR1cyUyMiUzQSUyMmlkbGUlMjIlMkMlMjJpbnRlcnJ1cHRlZCUyMiUzQWZhbHNlJTJDJTIyZXJyb3JlZCUyMiUzQWZhbHNlJTJDJTIyc3RvcHBlZCUyMiUzQWZhbHNlJTJDJTIycnVuRWxhcHNlZFRpbWVNcyUyMiUzQTI0Ljg2OTkxODgyMzI0MjE4OCUyQyUyMnJ1blN0YXJ0VGltZXN0YW1wJTIyJTNBbnVsbCU3RA==","2":"JTdCJTIyb3V0bGluZSUyMiUzQW51bGwlMkMlMjJvdXRwdXQlMjIlM0ElN0IlMjJjaGFubmVsJTIyJTNBJTIyb3V0cHV0JTIyJTJDJTIybWltZXR5cGUlMjIlM0ElMjJ0ZXh0JTJGcGxhaW4lMjIlMkMlMjJkYXRhJTIyJTNBJTIyJTIyJTJDJTIydGltZXN0YW1wJTIyJTNBMTcwNTEyODg1OC4zODA1NSU3RCUyQyUyMmNvbnNvbGVPdXRwdXRzJTIyJTNBJTVCJTdCJTIyY2hhbm5lbCUyMiUzQSUyMnN0ZG91dCUyMiUyQyUyMm1pbWV0eXBlJTIyJTNBJTIydGV4dCUyRnBsYWluJTIyJTJDJTIyZGF0YSUyMiUzQSUyMkhlbGxvJTIwV29ybGQlNUNuJTIyJTJDJTIydGltZXN0YW1wJTIyJTNBMTcwNTEyODg1OC4zOTI2OTYxJTdEJTVEJTJDJTIyc3RhdHVzJTIyJTNBJTIyaWRsZSUyMiUyQyUyMmludGVycnVwdGVkJTIyJTNBZmFsc2UlMkMlMjJlcnJvcmVkJTIyJTNBZmFsc2UlMkMlMjJzdG9wcGVkJTIyJTNBZmFsc2UlMkMlMjJydW5FbGFwc2VkVGltZU1zJTIyJTNBMC4zODIxODQ5ODIyOTk4MDQ3JTJDJTIycnVuU3RhcnRUaW1lc3RhbXAlMjIlM0FudWxsJTdE","3":"JTdCJTIyb3V0bGluZSUyMiUzQW51bGwlMkMlMjJvdXRwdXQlMjIlM0ElN0IlMjJjaGFubmVsJTIyJTNBJTIyb3V0cHV0JTIyJTJDJTIybWltZXR5cGUlMjIlM0ElMjJ0ZXh0JTJGcGxhaW4lMjIlMkMlMjJkYXRhJTIyJTNBJTIyJTIyJTJDJTIydGltZXN0YW1wJTIyJTNBMTcwNTEyODg1OC4zNzk4NDMlN0QlMkMlMjJjb25zb2xlT3V0cHV0cyUyMiUzQSU1QiU1RCUyQyUyMnN0YXR1cyUyMiUzQSUyMmlkbGUlMjIlMkMlMjJpbnRlcnJ1cHRlZCUyMiUzQWZhbHNlJTJDJTIyZXJyb3JlZCUyMiUzQWZhbHNlJTJDJTIyc3RvcHBlZCUyMiUzQWZhbHNlJTJDJTIycnVuRWxhcHNlZFRpbWVNcyUyMiUzQTAuMzM3MTIzODcwODQ5NjA5NCUyQyUyMnJ1blN0YXJ0VGltZXN0YW1wJTIyJTNBbnVsbCU3RA=="}};
window.__MARIMO_STATIC__.assetUrl = "https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist";
window.__MARIMO_STATIC__.files = {};
</script>
<marimo-code hidden="">
import%20marimo%0A%0A__generated_with%20%3D%20%220.1.76%22%0Aapp%20%3D%20marimo.App()%0A%0A%0A%40app.cell%0Adef%20__()%3A%0A%20%20%20%20import%20marimo%20as%20mo%0A%20%20%20%20return%20mo%2C%0A%0A%0A%40app.cell%0Adef%20__(mo)%3A%0A%20%20%20%20mo.md(%22Hello%20World%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20__()%3A%0A%20%20%20%20print(%22Hello%20World%22)%0A%20%20%20%20return%0A%0A%0A%40app.cell%0Adef%20__()%3A%0A%20%20%20%20return%0A%0A%0Aif%20__name__%20%3D%3D%20%22__main__%22%3A%0A%20%20%20%20app.run()
</marimo-code>
<script type="module" crossorigin="anonymous" src="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/assets/index-nZBIgTj0.js"></script><link rel="stylesheet" crossorigin="anonymous" href="https://cdn.jsdelivr.net/npm/@marimo-team/frontend@0.1.76/dist/assets/index-Qg8Acq0C.css"></body></html>
That’s a lot of stuff. It’s also not interactive until they get some WASM magic going. One thing that I was curious about is how I can do something in Marimo and then incorporate it into a page (like on this site). Jupyter allows me to export as markdown, and then I can go from there.
Perhaps Marimo is trying to do a lot more and keep the interface consistent. That’s nice, but it would be great if I can export as markdown.
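Most of the bulk in that export is encoded text rather than markup. As far as I can tell, the `<marimo-code>` element is the notebook source percent-encoded, and each entry in `cellData` / `cellRuntime` is base64 wrapped around percent-encoded JSON. The sketch below is my own reading of this one export, not a documented format; the helper names and the round-trip sample are made up for illustration.

```python
import base64
import json
from urllib.parse import quote, unquote


def decode_marimo_code(encoded):
    """Undo the percent-encoding used inside the <marimo-code> element."""
    return unquote(encoded)


def decode_cell_payload(payload):
    """Decode one cell entry: base64 first, then percent-decoding, then JSON."""
    return json.loads(unquote(base64.b64decode(payload).decode("ascii")))


# First part of the <marimo-code> text from the export above.
print(decode_marimo_code("import%20marimo%0A%0A__generated_with%20%3D%20%220.1.76%22"))

# The first 24 characters of cell "0" already show the layering.
prefix = base64.b64decode("JTdCJTIyaWQlMjIlM0ElMjIw").decode("ascii")
print(prefix)           # percent-encoded text
print(unquote(prefix))  # start of a JSON object

# Round trip with a made-up payload encoded the same way.
sample = {"id": "0", "code": "import marimo as mo"}
payload = base64.b64encode(quote(json.dumps(sample)).encode("ascii")).decode("ascii")
print(decode_cell_payload(payload))
```

So getting the code and outputs back out of the HTML is possible, even if it is not the markdown export I was hoping for.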
Sharing a notebook
If you click on the share notebook button, it shows this:
Share static notebook
You can publish a static, non-interactive version of this notebook to the public web. We will create a link for you that lives on https://static.marimo.app.
That’s interesting, but it instantly makes me think about the costs involved. Any time I see things like this, I start to wonder where the money comes from and whether this free tool will stay free for much longer.
Concluding thoughts
I like this a lot. It’s a little bit buggy, but the way the demos look and the interactive parts of this mean it’s going to be really great.
2024-01-13
In light of the issue from the other day where I discovered that NHK had different numbers than I did (for aftershocks) I made a recent discovery.
I happened to be browsing the main page for JMA - something that I have never done simply because I bookmark certain areas of the site that I want to use frequently. The main thing I look at is the radar for my area to see if it’s going to rain - and it usually does.
The home page for JMA lists a number of articles / news updates. I clicked on a random update about the Noto earthquakes expecting to see roughly the information that I already knew - but it was different! To my surprise, they had reported more aftershocks as well. Now it was clear to me that NHK wasn’t lying and they in fact had some different information from JMA.
The question I have now is why doesn’t JMA show everything in the earthquake map?
Now, the report didn’t have a table of data and it was still not obvious where to find this information. So I went digging on the Japanese version of the website. In the English version it’s not possible to find this information (or so it seems).
The following are some of the ways to access quake data that I have found.
Epicenter Data
The 各種データ・資料 tab gives access to a great amount of data for different domains. Under the heading 地震・津波・火山 there is a link to 震源リスト. I think I had looked at this before but the actual list itself is buried a little.
The page itself displays a list of months like this:
震源リスト
2024年01月(クリックするとリストが開閉します)
2023年12月
2023年11月
2023年10月
...
When you click on a month, you are given the dates of the month to click on.
When you click on a day, you are taken to another page with a link like this: https://www.data.jma.go.jp/eqev/data/daily_map/20240109.html
That page displays a map of Japan with all of the earthquakes for that date. This is much more than what is reported on the user-friendly interactive earthquake map. It’s actually surprising.
But underneath the map, there is yet another thing to click on:
After clicking that you are presented with a list of epicenters. Here’s an entry:
2024 1 9 00:05 51.3 35°50.6'N 137° 8.5'E 13 0.1 岐阜県飛騨地方
You can see the 0.1 value - that’s the recorded magnitude, and this is possibly why it isn’t available in the user-friendly interactive earthquake map.
It wouldn’t be too difficult to gather this data from each page for each day, and so far I haven’t found another way that these verbose observations are made available.
One discovery was that the map for the day has an easy link:
https://www.data.jma.go.jp/eqev/data/daily_map/20240109japan.png
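Since both links only differ by the date, building them for a range of days is straightforward. A small sketch, assuming the URL pattern stays the same for every day in the list (I have only looked at a handful); the function names are my own.

```python
from datetime import date, timedelta

BASE = "https://www.data.jma.go.jp/eqev/data/daily_map"


def daily_urls(day):
    """Return the (page, map image) URLs for one day, following the pattern above."""
    stamp = day.strftime("%Y%m%d")
    return f"{BASE}/{stamp}.html", f"{BASE}/{stamp}japan.png"


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


for day in date_range(date(2024, 1, 1), date(2024, 1, 9)):
    page_url, image_url = daily_urls(day)
    print(page_url)
```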
There are also different parts you can click on the map as revealed in the HTML but these lead to close-ups of the map, not any more data.
The observation data is baked into the page itself inside a <pre> HTML tag. It’s not even in a structured format - it’s almost like the output of some kind of script just pasted into the HTML.
It’s not complicated to grab that data, but, ugh.
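For what it’s worth, here is a rough sketch of parsing one of those lines with a regular expression. It is written against the single entry shown above, so the column meanings are partly guesses: the source only confirms that 0.1 is the magnitude, and I am assuming the number before it is the depth in km. Real pages may have extra columns or spacing that this pattern does not handle.

```python
import re
from datetime import datetime, timedelta

LINE_RE = re.compile(
    r"(?P<year>\d{4})\s+(?P<month>\d{1,2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2})\s+(?P<second>[\d.]+)\s+"
    r"(?P<lat_deg>\d+)°\s*(?P<lat_min>[\d.]+)['’′]N\s+"
    r"(?P<lon_deg>\d+)°\s*(?P<lon_min>[\d.]+)['’′]E\s+"
    r"(?P<depth>\d+)\s+(?P<mag>-?[\d.]+)\s+(?P<region>\S+)"
)


def parse_epicenter_line(line):
    """Turn one line of the <pre> block into a dict, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    g = match.groupdict()
    origin = datetime(
        int(g["year"]), int(g["month"]), int(g["day"]),
        int(g["hour"]), int(g["minute"]),
    ) + timedelta(seconds=float(g["second"]))
    return {
        "origin_time": origin,
        # Degrees and decimal minutes converted to decimal degrees.
        "lat": int(g["lat_deg"]) + float(g["lat_min"]) / 60,
        "lon": int(g["lon_deg"]) + float(g["lon_min"]) / 60,
        "depth_km": int(g["depth"]),  # assumed meaning of this column
        "mag": float(g["mag"]),
        "region": g["region"],
    }


row = parse_epicenter_line("2024 1 9 00:05 51.3 35°50.6'N 137° 8.5'E 13 0.1 岐阜県飛騨地方")
print(row)
```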
Another issue is that this data is limited. At the time of writing it goes back to sometime in 2022, and is only posted up to two days ago.
Intensity Database
The url:
https://www.data.jma.go.jp/eqdb/data/shindo/index.html
This provides a map and form interface to see earthquakes for any range of time dating back to 1923.
Digging into the site, it seems entirely possible to use the API and get a JSON response. This is nice, but I wondered if it gives all of the results, even the very small earthquakes (i.e. M0.1).
After checking just briefly, I don’t think it does. Each dot on the map has a recorded intensity and it’s possible to filter the intensity in the form by anything more than or equal to 1. So it seems that any quake that wasn’t strong enough to register an intensity level isn’t going to be in this data.
That’s unfortunate.
XML Data
There is XML data available from JMA. This gives an overview of what is available, but no links:
https://www.data.jma.go.jp/add/suishin/cgi-bin/catalogue/make_product_page.cgi?id=Jishin
There is another page http://xml.kishou.go.jp/xmlpull.html that has links to ATOM feeds:
- The high frequency feed is updated every minute and carries the incoming reports from at least the last 10 minutes.
- The long-term feed is updated every hour and carries all incoming reports from the last several days.
Taking a look at the high frequency feed, there are numerous entries for both earthquakes and volcanic activity. Here’s an earthquake entry:
<entry>
<title>震源・震度に関する情報</title>
<id>
https://www.data.jma.go.jp/developer/xml/data/20240110230013_0_VXSE53_010000.xml
</id>
<updated>2024-01-10T23:00:13Z</updated>
<author>
<name>気象庁</name>
</author>
<link type="application/xml" href="https://www.data.jma.go.jp/developer/xml/data/20240110230013_0_VXSE53_010000.xml"/>
<content type="text">【震源・震度情報】11日07時57分ころ、地震がありました。</content>
</entry>
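To pick entries like this out of a feed without any extra libraries, the standard library XML parser is enough. This is only a sketch: the `<feed>` wrapper around the sample entry is my assumption about what the surrounding document looks like, and `earthquake_links` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}
EARTHQUAKE_TITLE = "震源・震度に関する情報"

# A one-entry feed built from the entry above; the <feed> wrapper is assumed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>震源・震度に関する情報</title>
    <id>https://www.data.jma.go.jp/developer/xml/data/20240110230013_0_VXSE53_010000.xml</id>
    <updated>2024-01-10T23:00:13Z</updated>
    <author><name>気象庁</name></author>
    <link type="application/xml" href="https://www.data.jma.go.jp/developer/xml/data/20240110230013_0_VXSE53_010000.xml"/>
    <content type="text">【震源・震度情報】11日07時57分ころ、地震がありました。</content>
  </entry>
</feed>"""


def earthquake_links(feed_xml):
    """Return (updated, href) for every entry whose title marks an earthquake report."""
    root = ET.fromstring(feed_xml)
    links = []
    for entry in root.findall("atom:entry", ATOM):
        if entry.findtext("atom:title", namespaces=ATOM) != EARTHQUAKE_TITLE:
            continue
        link = entry.find("atom:link", ATOM)
        links.append((entry.findtext("atom:updated", namespaces=ATOM), link.get("href")))
    return links


for updated, href in earthquake_links(SAMPLE_FEED):
    print(updated, href)
```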
One of the linked XML files has earthquake information like this:
<Earthquake>
<OriginTime>2024-01-11T07:57:00+09:00</OriginTime>
<ArrivalTime>2024-01-11T07:57:00+09:00</ArrivalTime>
<Hypocenter>
<Area>
<Name>石川県能登地方</Name>
<Code type="震央地名">390</Code>
<jmx_eb:Coordinate description="北緯37.5度 東経137.2度 深さ 10km" datum="日本測地系">+37.5+137.2-10000/</jmx_eb:Coordinate>
</Area>
</Hypocenter>
<jmx_eb:Magnitude type="Mj" description="M3.4">3.4</jmx_eb:Magnitude>
</Earthquake>
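The Coordinate value packs latitude, longitude and depth into one string. Going by the description attribute (37.5 degrees north, 137.2 degrees east, depth 10km), it looks like signed decimal degrees followed by a height in metres. A small parsing sketch under that assumption; other entries may use a different layout (for example degrees and minutes), which this does not handle.

```python
import re

COORD_RE = re.compile(
    r"^([+-]\d+(?:\.\d+)?)([+-]\d+(?:\.\d+)?)([+-]\d+(?:\.\d+)?)?/$"
)


def parse_coordinate(text):
    """Split '+lat+lon-depth/' into latitude, longitude (degrees) and depth (km)."""
    match = COORD_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    lat, lon, height = match.groups()
    # The third field is a height in metres, so a depth of 10 km shows up as -10000.
    depth_km = None if height is None else -float(height) / 1000
    return float(lat), float(lon), depth_km


print(parse_coordinate("+37.5+137.2-10000/"))
```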
Along with intensity observations:
<Intensity>
<Observation>
<CodeDefine>
<Type xpath="Pref/Code">地震情報/都道府県等</Type>
<Type xpath="Pref/Area/Code">地震情報/細分区域</Type>
<Type xpath="Pref/Area/City/Code">気象・地震・火山情報/市町村等</Type>
<Type xpath="Pref/Area/City/IntensityStation/Code">震度観測点</Type>
</CodeDefine>
<MaxInt>1</MaxInt>
<Pref>
<Name>石川県</Name>
<Code>17</Code>
<MaxInt>1</MaxInt>
<Area>
<Name>石川県能登</Name>
<Code>390</Code>
<MaxInt>1</MaxInt>
<City>
<Name>珠洲市</Name>
<Code>1720500</Code>
<MaxInt>1</MaxInt>
<IntensityStation>
<Name>珠洲市三崎町</Name>
<Code>1720500</Code>
<Int>1</Int>
</IntensityStation>
</City>
</Area>
</Pref>
</Observation>
</Intensity>
The long feed has a lot of information. The question is - does it contain more information than the user-friendly map interface?
I can load it up in pandas with pd.read_xml. It takes a URL, which is useful, but it’s important to specify an xpath that matches all of the entries:
import pandas as pd
namespaces = {'atom': 'http://www.w3.org/2005/Atom'}
# Define the XPath expression
xpath_expression = "//atom:entry"
# Read XML using pandas read_xml
df = pd.read_xml(
    "https://www.data.jma.go.jp/developer/xml/feed/eqvol_l.xml",
    xpath=xpath_expression,
    namespaces=namespaces,
)
# Filter only earthquake info:
df = df.loc[df.title == "震源・震度に関する情報"]
# Sort by date/time
df = df.sort_values("updated")
I’m accessing this on January 11, and the first entry is dated 2024-01-03T23:08:30Z, the last entry is dated 2024-01-10T23:00:13Z, and there are a total of 334 rows. That means this feed definitely does not contain verbose data. It’s actually specifically for situations that may require some kind of emergency response.
Interactive Map
This can be found here:
https://www.jma.go.jp/bosai/map.html#11/&elem=int&contents=earthquake_map
A JSON file is provided, which I have used to gather the information I tried to analyze before. This is limited to information within one month prior to the current date, and there is some kind of threshold applied to the observations. I believe the threshold is related to measured intensity – if the quake isn’t strong enough to register an intensity level on land it will not be included.
One issue I have noticed with this data is that it has problems with the coordinates and the depth measurement of each quake. The coordinates and depth in the epicenter data are more accurate, possibly because JMA has done some post-processing and manual checking to get better results.
Summary
There are a few sources:
- Epicenter Data (unfiltered, limited to start at ~2022)
- Intensity Database (all data, filtered)
- XML (only emergency)
- Interactive Map (1 month of data, filtered)
The best approach to analyze recent data may be to combine the epicenter data and the intensity database data. It would require identifying the eid and then doing some merging, etc.
To analyze the most recent data, the interactive map data is best - but it simply doesn’t contain everything. I believe this is meant for people to look at so they can see why their house is shaking.
2024-01-05
Earlier this morning I checked NHK’s website for updates about Ishikawa and the post-quake situation. I came across one sentence in an article that immediately stood out to me since I have been wrangling the quake data from JMA in pandas for the past few days. Something didn’t add up.
Here’s the section from the article:
今月1日午後4時10分ごろに発生した能登半島地震では、石川県の志賀町で震度7の激しい揺れを観測したほか、震度6強の揺れを七尾市と輪島市、珠洲市、穴水町で観測しました。
また、新潟県と石川県、富山県、福井県、長野県、岐阜県で震度6弱から5弱を観測しました。
能登地方やその周辺を震源とする地震はその後も相次ぎ、震度1以上の揺れを観測した地震は5日午前4時までに786回にのぼっています。
気象庁は、揺れの強かった地域では、家屋の倒壊や土砂災害などの危険性が高まっているため、1日の地震から1週間ほどは最大震度7程度の揺れに注意するよう呼びかけています。
A rough English translation of the sentence in question: “Earthquakes with an epicenter in and around the Noto region have continued after that, and earthquakes with a seismic intensity of 1 or more have occurred 786 times up to 4 a.m. on the 5th.”
When I checked yesterday morning, the number according to my filtered data from JMA was 247 - that was at about 8:30am, January 4th.
The number is so different that it made me wonder if I had something wrong.
There are a few possibilities:
- Something about my method was flawed.
- JMA is providing NHK with data that is not publicly available.
- NHK has made an error in how they analyzed data.
The first thing that I checked was the actual number of entries in the dataset, starting from January 1st at 4:09pm until now (current time is 8:56pm, January 5th). It’s reasonable that the total number of entries would now be more than 786. Keep in mind that there are a number of other things in that data - earthquakes outside of Japan, outside of Noto and the surrounding area, “information” (which is not information about an epicenter), etc. Those are things that I filtered out before because they are not relevant to analyzing quake data from Noto peninsula.
Here’s the code that I ran:
import pandas as pd
import datetime
# the json url
quakes_json_url = "https://www.jma.go.jp/bosai/quake/data/list.json"
# load it into a dataframe
quakes = pd.read_json(quakes_json_url)
# convert to datetime but first will drop the offset
quakes["at"] = quakes["at"].apply(lambda x: x.replace("+09:00", "").replace("T", " "))
quakes["at"] = pd.to_datetime(quakes["at"])
# change the 'at' column to date_time
quakes = quakes.rename(columns={"at": "date_time"})
# make a mask to have only those entries with a date_time starting after 4:09pm, January 1st
dt_mask = quakes.date_time > datetime.datetime.fromisoformat("2024-01-01 16:09")
# Apply the mask and get the number of entries
quakes.loc[dt_mask].shape[0]
Result:
Note: the data available in that JSON file is limited to one month prior to the current date - so if you are trying this in the distant future it won’t work.
Note: I didn’t account for duplicated data, and there are some where the eid is the same.
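For reference, here is a small standard-library sketch of how the duplicates could be handled, and of comparing timestamps without stripping the +09:00 offset first. The records are made up (including the eid values); only the "eid" and "at" field names come from the data above.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))
CUTOFF = datetime(2024, 1, 1, 16, 9, tzinfo=JST)

# Made-up records shaped like the rows used above (only "eid" and "at").
records = [
    {"eid": "20240101161000", "at": "2024-01-01T16:10:00+09:00"},
    {"eid": "20240101161000", "at": "2024-01-01T16:10:00+09:00"},  # duplicate eid
    {"eid": "20240101120000", "at": "2024-01-01T12:00:00+09:00"},  # before the cutoff
    {"eid": "20240102031500", "at": "2024-01-02T03:15:00+09:00"},
]


def unique_events_after(rows, cutoff):
    """Keep the first row per eid whose timestamp is later than the cutoff."""
    seen = {}
    for row in rows:
        when = datetime.fromisoformat(row["at"])  # keeps the +09:00 offset
        if when > cutoff and row["eid"] not in seen:
            seen[row["eid"]] = when
    return seen


print(len(unique_events_after(records, CUTOFF)))
```

Dropping duplicates can only make the count smaller, so it does not change the conclusion here.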
I don’t even need to dig any further. There just isn’t enough data to support the claim from NHK.
Well, just to show in more detail what kind of entries there are – there are several kinds of “titles” (column = ttl) in the dataset. Here are the value counts for each with the same date_time mask applied from above:
quakes.loc[dt_mask].ttl.value_counts()
Result:
震源・震度情報 537
震度速報 105
震源に関する情報 41
顕著な地震の震源要素更新のお知らせ 4
I see a large number of actual epicenter entries (537). Let’s see where those might be. The en_anm column is the English area name. Let’s see the value counts for those, but just the epicenter:
quakes.loc[dt_mask & (quakes.ttl == "震源・震度情報")].en_anm.value_counts()
Result:
Noto, Ishikawa Prefecture 318
Off the Coast of Noto Peninsula 145
Adjacent Sado 35
Off the Coast of Joetsu and Chuetsu, Niigata Prefecture 18
Off the west Coast of Ishikawa Prefecture 6
Toyama Bay 4
Northwestern Chiba Prefecture 1
Adjacent Sea of Amami-Oshima Island 1
Off the Coast of Iwate Prefecture 1
Northern Wakayama Prefecture 1
Adjacent Sea of Ishigakijima Island 1
Northern Ibaraki Prefecture 1
Adjacent Sea of Tokara Islands 1
Northern Miyagi Prefecture 1
Eastern Region · Fuji Five Lakes, Yamanashi Prefecture 1
Off the east Coast of Osumi Peninsula 1
Iyonada Sea 1
Just to double check another thing - each area has a code. I had been using 390 since that seems to have been the code for Noto. Let’s check the counts for those codes according to the English name, with the same filter from the above:
quakes.loc[dt_mask & (quakes.ttl == "震源・震度情報")].groupby(["acd", "en_anm"]).en_anm.count()
Result:
acd en_anm
220 Northern Miyagi Prefecture 1
286 Off the Coast of Iwate Prefecture 1
300 Northern Ibaraki Prefecture 1
341 Northwestern Chiba Prefecture 1
379 Off the Coast of Joetsu and Chuetsu, Niigata Prefecture 18
390 Noto, Ishikawa Prefecture 318
412 Eastern Region · Fuji Five Lakes, Yamanashi Prefecture 1
494 Off the west Coast of Ishikawa Prefecture 6
495 Off the Coast of Noto Peninsula 145
497 Toyama Bay 4
498 Adjacent Sado 35
550 Northern Wakayama Prefecture 1
680 Iyonada Sea 1
793 Adjacent Sea of Amami-Oshima Island 1
798 Adjacent Sea of Tokara Islands 1
820 Off the east Coast of Osumi Peninsula 1
854 Adjacent Sea of Ishigakijima Island 1
I’m a little rusty on my Japanese geography, but I do know that at least 6 of these entries are relevant:
acd en_anm
390 Noto, Ishikawa Prefecture 318
495 Off the Coast of Noto Peninsula 145
379 Off the Coast of Joetsu and Chuetsu, Niigata Prefecture 18
498 Adjacent Sado 35
494 Off the west Coast of Ishikawa Prefecture 6
497 Toyama Bay 4
Adding those together, I get a total of 526. The counts from other areas are so small they make little difference. Remember, NHK’s number was 786 – and that was almost 16 hours ago.
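Adding up the six counts listed above, and the gap to NHK’s figure, as a quick check:

```python
# Counts for the six relevant area codes, copied from the table above.
relevant_counts = {
    390: 318,  # Noto, Ishikawa Prefecture
    495: 145,  # Off the Coast of Noto Peninsula
    379: 18,   # Off the Coast of Joetsu and Chuetsu, Niigata Prefecture
    498: 35,   # Adjacent Sado
    494: 6,    # Off the west Coast of Ishikawa Prefecture
    497: 4,    # Toyama Bay
}

nhk_count = 786
total = sum(relevant_counts.values())

print(total)              # 526
print(nhk_count - total)  # 260
```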
Wrapping up
Now with this information I do understand that my method of just checking for quakes from 390 was not adequate, since there are other areas that are related and likely have an impact (as far as an aftershock is concerned). I believe part of my misunderstanding was due to how those codes work - and this bit of work from above helps to make that more clear.
In a follow-up at some point it would be good to actually use coordinates as a window. Another interesting angle would be to see how many earthquakes had a registered intensity on Noto peninsula.
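A coordinate window could be as simple as a latitude/longitude bounding box. The sketch below is only to show the idea: the box limits are rough numbers I picked to cover the Noto peninsula and the nearby sea, not an official region, and the Chiba coordinates are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    south: float
    north: float
    west: float
    east: float

    def contains(self, lat, lon):
        """True if the point lies inside the box (edges included)."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Hypothetical window around the Noto peninsula; adjust to taste.
NOTO_WINDOW = Window(south=36.5, north=38.5, west=136.0, east=138.5)

# (label, latitude, longitude); illustrative points only.
epicenters = [
    ("Noto, Ishikawa Prefecture", 37.5, 137.2),
    ("Northwestern Chiba Prefecture", 35.7, 140.1),
]

inside = [name for name, lat, lon in epicenters if NOTO_WINDOW.contains(lat, lon)]
print(inside)
```

This sidesteps the area codes entirely, which is the appeal.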
However, the number here is less than what NHK reported. It seems there was either an error on NHK’s part, or JMA shares non-public data with them - but it can’t be an intentional error… the government-run media would never lie to us.
2024-01-04
A quick post. It’s almost 8:30am, January 4th. There have been 247 quakes since the big one.
Let’s see what the magnitude is like as a boxplot (for every hour):
iquakes.boxplot(
    by=["date", "hour"],
    column=["mag"],
    rot=90,
    figsize=(20,20),
    ylabel="Magnitude",
)