Code Optimization

There is a traditional sequence for writing code, and it goes like this:

  1. Make it run
  2. Make it right (testing)
  3. Make it fast (optimization)

Making it fast is the last step, and you should only optimize when it is necessary. Also, it is good to know when a program is “fast enough” for your needs. Optimization has a price:

  1. Cost in programmer time
  2. Optimized code is often more complex
  3. Optimized code is often less generic

However, having fast code is often necessary for statistical computing, so we will spend some time learning how to make code run faster. To do so, we need to understand why our code is slow; code can be slow because of different resource limitations:

  1. CPU-bound - CPU is working flat out
  2. Memory-bound - Out of RAM - swapping to hard disk
  3. IO-bound - Lots of data transfer to and from hard disk
  4. Network-bound - CPU is waiting for data to come over the network or from memory (“starvation”)

Different bottlenecks may require different approaches. However, there is a natural order to making code fast:

  1. Cheat
    • Use a better machine (e.g. if RAM is the limiting factor, buy more RAM)
    • Solve a simpler problem (e.g. will a subsample of the data suffice?)
    • Solve a different problem (perhaps solving a toy problem will suffice for your JASA paper? If your method is so useful, maybe someone else will optimize it for you)
  2. Find out what is slowing down the code (profiling); a timing/profiling sketch follows this list
    • Using timeit
    • Using time
    • Using cProfile
    • Using line_profiler
    • Using memory_profiler
  3. Use better algorithms and data structures
  4. Using compiled code written in another language (a ctypes sketch follows this list)
    • Calling code written in C/C++
      • Using bitey
      • Using ctypes
      • Using cython
    • Calling code written in Fortran
      • Using f2py
    • Calling code written in Julia
      • Using pyjulia
  5. Converting Python code to compiled code (a numba sketch follows this list)
    • Using numexpr
    • Using numba
    • Using cython
  6. Parallel programs
    • Amdahl’s and Gustafson’s laws
    • Embarrassingly parallel problems
    • Problems requiring communication and synchronization
      • Race conditions
      • Deadlock
    • Task granularity
    • Parallel programming idioms
  7. Execute in parallel (a multiprocessing sketch follows this list)
    • On multi-core machines
    • On multiple machines
      • Using IPython
      • Using MPI4py
      • Using Hadoop/Spark
    • On GPUs
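
To make step 2 concrete, here is a minimal sketch of timing and profiling with the standard library. The slow_mean function and the data size are made-up examples for illustration, not part of the original notes.

    import cProfile
    import timeit


    def slow_mean(xs):
        # Deliberately naive mean: a pure-Python loop with repeated indexing
        total = 0.0
        for i in range(len(xs)):
            total += xs[i]
        return total / len(xs)


    data = list(range(1_000_000))

    # timeit: wall-clock time for repeated runs of a small snippet
    print(timeit.timeit("slow_mean(data)", globals=globals(), number=10))

    # cProfile: per-function call counts and cumulative times
    cProfile.run("slow_mean(data)")

In an IPython session the %timeit and %prun magics give the same information with less typing; line_profiler and memory_profiler add per-line views once they are installed.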
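
For step 4, here is a minimal sketch of calling a C function through ctypes. It loads the system math library, which assumes a typical Unix-like system; the signature declarations are the part ctypes cannot infer on its own.

    import ctypes
    import ctypes.util

    # Locate and load the C math library (assumes a Unix-like system)
    libm = ctypes.CDLL(ctypes.util.find_library("m"))

    # Declare the C signature so ctypes converts the Python float correctly
    libm.sqrt.argtypes = [ctypes.c_double]
    libm.sqrt.restype = ctypes.c_double

    print(libm.sqrt(2.0))  # 1.4142135623730951

The same pattern extends to your own compiled C/C++ shared libraries: build with -shared -fPIC, load the resulting library with ctypes.CDLL, and declare argtypes/restype for each function you call.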
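
For step 5, a minimal numba sketch (assuming numba is installed). The Monte Carlo estimate of pi is an illustrative example, not from the original notes; the point is that the decorated loop is compiled to machine code on the first call.

    import numpy as np
    from numba import njit


    @njit
    def mc_pi(n):
        # A plain Python loop, compiled to machine code by numba
        inside = 0
        for _ in range(n):
            x = np.random.random()
            y = np.random.random()
            if x * x + y * y < 1.0:
                inside += 1
        return 4.0 * inside / n


    print(mc_pi(1_000_000))  # first call pays the compilation cost; later calls are fast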
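
For step 7, a minimal sketch of an embarrassingly parallel computation on a multi-core machine using only the standard library. The work function is a made-up CPU-bound toy task; each call runs in a separate process, so the cores work independently with no communication or synchronization.

    from concurrent.futures import ProcessPoolExecutor
    import math


    def work(n):
        # CPU-bound toy task: sum of square roots
        return sum(math.sqrt(i) for i in range(n))


    if __name__ == "__main__":
        tasks = [2_000_000] * 8
        # map distributes the tasks over worker processes (one per CPU core by default)
        with ProcessPoolExecutor() as pool:
            results = list(pool.map(work, tasks))
        print(sum(results))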
