Show HN: Sciagraph, performance+memory profiler for production Python batch jobs
3 points • 0 comments
Since 2/24/2016, 2:14:11 PM, @itamarst has earned 5,108 karma points across 1,408 contributions.
Recent @itamarst Activity
Show HN: Sciagraph, performance+memory profiler for production Python batch jobs
3 points • 0 comments
Why new Macs break your Docker build, and how to fix it
5 points • 0 comments
Pandas vectorization: faster code, slower code, bloated memory
1 point • 0 comments
Making pip installs a little less slow
13 points • 0 comments
CPUs, cloud VMs, and noisy neighbors: the limits of parallelism
2 points • 1 comment
Faster, more memory-efficient Python JSON parsing with msgspec
5 points • 0 comments
When Python can’t thread: a deep-dive into the GIL’s impact
4 points • 0 comments
I'm on vacation, will do Monday.
Fil can dump memory profiling reports on failed allocations; as another commenter said, Linux by default happily gives you memory even if it doesn't have any, so a failed allocation implies a giant allocation.
BTW, I'm starting a Slack for devs working on profilers; it would be great to have you all join. I'd love to hear more about the ELF patching technique (Fil uses LD_PRELOAD and the macOS equivalent).
Cool! Fil tracks every allocation too, although it doesn't dump that data at the moment, just the resulting report.
I'm excited to see more profiling tools for Python!
This sounds like it does peak memory, which is critical for batch jobs, since that's the bottleneck. Memory is fundamentally different from performance in that it's a limited resource rather than a cumulative cost: making any part of the program faster almost always helps speed the program up (at least a little, or at least reduces CPU load), but optimizing memory usage away from the peak has no impact. You have to be able to identify the peak in order to reduce memory usage.
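(Illustrative toy example, mine rather than the commenter's: the peak is set by whatever is alive at the single worst moment, here two large arrays that briefly coexist.)

    import numpy as np

    def process():
        data = np.ones((10_000, 10_000))   # ~800 MB of float64
        doubled = data * 2                 # another ~800 MB; peak is now ~1.6 GB
        del data                           # back down to ~800 MB
        return doubled.sum()               # the final result is tiny

    # Shrinking allocations anywhere else in the program won't lower the ~1.6 GB
    # peak; a peak-memory profiler points you at these two lines specifically.
    print(process())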
If you want peak memory profiling for Python that also runs on macOS, check out https://pythonspeed.com/fil/ (ARM support has some issues, but once I unpack my new Mac Mini I plan to fix it.)
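(For reference, Fil is driven from the command line, roughly like this; check the docs for the current invocation:)

    pip install filprofiler
    fil-profile run yourscript.py    # writes an HTML report showing where memory peaked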
Ways memray is better than Fil:
- Native callstacks.
- More kinds of reports, and ability to do custom post-processing of data.
- Much lower overhead (but not always, see reply).
- Subprocess support.
Fil I suspect has better flamegraphs: https://pythonspeed.com/articles/a-better-flamegraph/
And if you're running Python batch jobs, and want both peak memory and performance profiling in production, check out Sciagraph: https://pythonspeed.com/sciagraph/
(You can probably cobble together something like Sciagraph with py-spy + memray, but you won't e.g. get timeline reports designed with batch jobs in mind.)
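(For the curious, that cobbled-together alternative would look roughly like this; the tools are real, but the exact flags and workflow are my sketch, not anything Sciagraph-specific:)

    # performance: sampled flamegraph with py-spy
    py-spy record -o perf.svg -- python my_batch_job.py

    # memory: a separate run under memray, then render a flamegraph
    memray run -o memory.bin my_batch_job.py
    memray flamegraph memory.bin

    # note: two separate runs, and no combined timeline report for the batch job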
Speeding up software with faster hardware: tradeoffs and alternatives
1 point • 0 comments
Bashing the Bash – Replacing Shell Scripts with Python (2017)
92 points • 132 comments
Please stop writing shell scripts
38 points • 59 comments
Processing large JSON files in Python without running out of memory
2 points • 0 comments
Docker builds in CircleCI: go faster, and support newer Linux versions
2 points • 0 comments
Faster Python calculations with Numba: 2 lines of code, 13× speed-up
3 points • 0 comments
Finding leaked secrets in your Docker image with a scanner
1 point • 0 comments
The fastest way to read a CSV in Pandas
5 points • 0 comments