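Original article: https://togototo.wordpress.com/2013/07/23/benchmarking-level-generation-go-rust-haskell-and-d/
The benchmarks in this article date from 2013; the figures below do not reflect the current versions of these languages.
See also:
"The Joke of the Great System-Level Programming Language Performance Showdown: Go"
http://my.oschina.net/chai2010/blog/150379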
I’m working on a random level generator for a game that, although written in C++, is modular such that the level generator can be written in something more high-level. As C++ isn’t always the most fun and productive language to program in, I set out to benchmark some potential alternatives, using a simple Roguelike level generation benchmark that relies heavily on iteration and conditionals and hence roughly mimics the actual generator. The code used is available here: https://github.com/logicchains/levgen-benchmarks. Any suggestions for improving the code used would be most welcome. The majority of the running time is spent on this (using the Haskell version as I think it’s the easiest to read):
roomHitRoom Room {rPos=(x,y), rw=w, rh=h} Room {rPos=(x2, y2), rw=w2, rh=h2}
    | (x2 + w2 + 1) < x || x2 > (x + w + 1) = False
    | (y2 + h2 + 1) < y || y2 > (y + h + 1) = False
    | otherwise = True
This function checks a newly generated room against the list of previously placed rooms; the room is discarded if it collides with any of them (it’s a brute-force level generation technique; the actual engine is a bit more sophisticated, but still relies on the same principle). Much of the rest of the time is spent on random number generation, so take these benchmarks with a grain of salt: to a degree they’re as much a benchmark of the respective languages’ random number generators as they are of their general speed. In other words, these benchmarks are relevant to my goal, not necessarily yours, and the generalisations made later in this post should be read in that context.
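To make the brute-force step concrete, here is a minimal sketch of the same check in Rust. It is not the code from the benchmark repository: the `Room` layout, `room_hit_room` and `try_place` are illustrative names, and the logic simply mirrors the Haskell `roomHitRoom` guards above.

```rust
// Hypothetical Rust counterpart of the Haskell Room record (position plus size).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Room {
    x: i32,
    y: i32,
    w: i32,
    h: i32,
}

// Mirrors roomHitRoom: two rooms collide unless they are separated by more
// than a one-tile margin along the x axis or along the y axis.
fn room_hit_room(a: &Room, b: &Room) -> bool {
    let apart_x = b.x + b.w + 1 < a.x || b.x > a.x + a.w + 1;
    let apart_y = b.y + b.h + 1 < a.y || b.y > a.y + a.h + 1;
    !(apart_x || apart_y)
}

// Brute-force placement: keep the candidate only if it hits no existing room.
fn try_place(rooms: &mut Vec<Room>, candidate: Room) -> bool {
    if rooms.iter().any(|r| room_hit_room(r, &candidate)) {
        return false;
    }
    rooms.push(candidate);
    true
}

fn main() {
    let mut rooms = Vec::new();
    let candidates = [
        Room { x: 0, y: 0, w: 5, h: 5 },
        Room { x: 3, y: 3, w: 5, h: 5 },   // overlaps the first room
        Room { x: 20, y: 20, w: 4, h: 4 }, // well clear of it
    ];
    for c in candidates.iter() {
        let placed = try_place(&mut rooms, *c);
        println!("{:?} placed: {}", c, placed);
    }
}
```

Every candidate is compared against every room placed so far, which is why this check dominates the running time as the room list grows.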
All implementations now use the XorShift PRNG, with Rust, Go, C and D using the exact same code (a 32-bit xorshift function), Scala using a 128-bit xorshift function, and Haskell using a 128-bit xorshift library. It is hence now less unreasonable to make comparisons between the languages, compilers and implementations.
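For readers unfamiliar with it, a 32-bit xorshift generator is only a few lines long. The sketch below uses Marsaglia’s classic 13/17/5 shift triple; this is an assumption, as the benchmark repository may use different constants, and the `XorShift32` type and `next_below` helper are illustrative names rather than anything from the original code.

```rust
// A 32-bit xorshift generator (Marsaglia's 13/17/5 shift triple).
// Assumption: the benchmark repository may use different shift constants.
struct XorShift32 {
    state: u32,
}

impl XorShift32 {
    // The state must never be zero, or every output would be zero,
    // so a zero seed is swapped for an arbitrary nonzero constant.
    fn new(seed: u32) -> Self {
        XorShift32 {
            state: if seed == 0 { 0x9E37_79B9 } else { seed },
        }
    }

    // One step: three shift-and-xor operations on the 32-bit state.
    fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }

    // Value in 0..bound via modulo; slightly biased, which is usually
    // acceptable for picking room positions and sizes.
    fn next_below(&mut self, bound: u32) -> u32 {
        self.next_u32() % bound
    }
}

fn main() {
    let mut rng = XorShift32::new(18);
    let x = rng.next_below(50);
    let y = rng.next_below(50);
    println!("candidate room position: ({}, {})", x, y);
}
```

The whole generator is a handful of shifts and xors on one machine word, which helps explain why the choice of compiler backend shows up so clearly in the timings.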
The results are as follows:
Compiler | Time (s) | % of fastest | SLOC
LDC      | 0.256    | 100%         | 107
Clang    | 0.279    | 92%          | 140
FCC*     | 0.283    | 90%          | 111
Rustc    | 0.303    | 85%          | 103
6g       | 0.323    | 79%          | 131
G++      | 0.330    | 78%          | 127
Scala    | 0.344    | 74%          | 79
GCC      | 0.347    | 74%          | 140
LLVM-GHC | 0.428    | 60%          | 78
GHC      | 0.546    | 47%          | 78
DMD      | 0.567    | 45%          | 136
GCCGO    | 0.598    | 43%          | 131
*A Redditor submitted a version in a language they’re working on called Neat, currently built on top of the LLVM and inspired by D; the compiler is here: https://github.com/FeepingCreature/fcc. I was impressed by how a new language can take advantage of the LLVM like that to achieve the same level of performance as much more mature languages.
**Edit: Improvements to the D code make it now the fastest implementation. Most of the speedup came from rearranging the checkColl function to allow for better LLVM optimisation, but I’m not personally familiar enough with the language or the LLVM to explain the optimisations in question.
The LLVM version used by LDC, Rust and Clang is 3.2. GCC and GCCGo used GCC version 4.7.3, while GDC used GCC version 4.6.4. Rust was version “0.8-pre (9da42dc 2013-07-17 04:16:42 -0700)”, GHC was version 7.6.2, DMD was 2.036 and 6g (Go) was 1.1.1. They were all run with -O3 where available, --opt-level=3 for Rust, -release for DMD, and -funbox-strict-fields for Haskell. D now also runs with -inline and -noboundscheck.
D was the fastest non-C language tested, with Rust following close behind. It’s worth noting that for this task at least both D and Rust “beat C” at the same code, when comparing their results to the GCC C executable’s. The LLVM appears to do a much better job of optimising this kind of PRNG than GCC, showing how sometimes compiler effectiveness at optimising a given piece of code can be a bigger factor than language choice in determining speed. I will be excited to run these benchmarks again in a year’s time and see how Clang, LLVM D and LLVM Rust compare then.
Scala really surprised me; I didn’t realise a JVM language could run so fast for this kind of code. It even matched the speed of GCC C, giving the lie to the statement that languages running on a VM must necessarily be slower. Many thanks to markehammons for submitting the Scala code! Although I haven’t yet learned Scala, the code is surprisingly easy to read, and isn’t full of ugly optimisation code like I’d have expected a fast JVM program to be. Definitely a language worth considering even for speed-critical tasks.
Rust’s speed was impressive for such a new language. Its flexibility with memory (optional GC) made it interesting to write in. That flexibility does, at least currently, come at a slight price, as the syntax takes a bit of getting used to: to pass a heap-allocated vector by reference I need to use myFunc(&mut myVector) (even though it’s already a mutable vector), and the function receiving it needs myAlias:&mut ~[myVector] in its type signature, like fn myFunc(myAlias:&mut ~[myVector]) {..}. Compared even to C ( void myFunc(struct MyType myAlias[arrayLength]){..} ), the Rust version looks a bit byzantine. For those who haven’t been following Rust, there are around seven different kinds of pointers: @ (garbage collected heap allocation), ~ (uniquely owned heap allocation), & (borrowed pointer to heap/stack allocation), * (raw C pointer, only useable in unsafe code), and mutable versions of the first three of those (actually, I’m not sure if there’s a mut & pointer, so maybe just six pointers in total). Note also that a pointer to a mutable value behaves differently from a pointer to a non-mutable value.
**Quoting kibwen on YCombinator:
“We’re moving the garbage-collected pointers (@) out into a library, so by default we’ll have only a single pointer, ~ (and there’s no mutable version of this). We’ll still have & and &mut, but those aren’t pointers, they’re references (it’s our own fault for calling them “borrowed pointers” in our documentation). So depending on how you count, that’s either one or three pointer types, which I’m happy with. The “unsafe” pointers exist only for C interop and have no bearing on normal code, so it’s inaccurate to count them.” So there’s actually really only three relevant kinds of pointer.
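For readers who know only present-day Rust: the `~[T]` syntax discussed above predates Rust 1.0, and the owned, heap-allocated vector is now written `Vec<T>`. A minimal sketch of the same call in current syntax follows; `MyType` and `my_func` are placeholder names chosen to echo the snippet above, not code from the benchmark.

```rust
// Hypothetical element type standing in for the post's room/tile structs.
#[derive(Debug, Clone, PartialEq)]
struct MyType {
    id: u32,
}

// Takes the vector by mutable reference, so the caller keeps ownership
// while the function is allowed to modify it in place.
fn my_func(my_alias: &mut Vec<MyType>) {
    let next_id = my_alias.len() as u32;
    my_alias.push(MyType { id: next_id });
}

fn main() {
    // The binding must be declared `mut`, and the call site must still
    // say `&mut` explicitly, much as in the 2013 syntax.
    let mut my_vector: Vec<MyType> = Vec::new();
    my_func(&mut my_vector);
    my_func(&mut my_vector);
    println!("{:?}", my_vector);
}
```

The explicit `&mut` at the call site remains, so the point about mutability being spelled out twice still applies; only the vector type itself reads differently.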
I also benchmarked a version of the Rust code using stack-allocated vectors, but there was no noticeable speed difference. I did however learn that stack-allocating vectors in Rust is currently rather verbose: since uninitialised values aren’t allowed, I had to create an instance of the object I wanted an array of, with all its values set to zero, and fill the vector with that upon creation. Hopefully in future Rust will adopt something like Go’s automatic initialisation of all values to zero, or at least have this as an option (or document it better, if it already is an option).
**Apparently it is already possible, as described here: https://news.ycombinator.com/item?id=6094819. Hopefully it’ll be documented in the Rust tutorial next time it’s updated.**
Currently it looks like this:
let emptyr = Room {X:0,Y:0,W:0,H:0,N:0};
let emptyt = Tile{X:0,Y:0,T:0};
let emptyl = Lev{TS : [emptyt,..2500] , RS : [emptyr,..100] };
let mut ls : [Lev, ..100] = [emptyl,..100];
Which is a lot of unnecessary code, and would quickly become cumbersome in a large project that made heavy use of stack-allocated arrays (although there doesn’t seem to be a reason for doing so, as in this test at least they performed no faster than heap-allocated ones).
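As a point of comparison, the same fill-from-a-zero-instance pattern can be sketched in current Rust, where fixed-size arrays are written `[T; N]` instead of `[T, ..N]`. The lowercase field names, the `TILES`, `ROOMS` and `LEVELS` constants and the `empty_levels` helper are assumptions made for this illustration, and the level count is reduced from 100 to 10 to keep the example’s stack usage small.

```rust
const TILES: usize = 2500;
const ROOMS: usize = 100;
const LEVELS: usize = 10; // the snippet above uses 100; reduced here for stack size

// Hypothetical modern counterparts of the Room, Tile and Lev types above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Room {
    x: i32,
    y: i32,
    w: i32,
    h: i32,
    n: i32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Tile {
    x: i32,
    y: i32,
    t: i32,
}

#[derive(Clone, Copy)]
struct Lev {
    ts: [Tile; TILES],
    rs: [Room; ROOMS],
}

// Builds the stack-allocated array by repeating a zeroed instance;
// the `[value; N]` form requires the element type to be Copy.
fn empty_levels() -> [Lev; LEVELS] {
    let empty_r = Room { x: 0, y: 0, w: 0, h: 0, n: 0 };
    let empty_t = Tile { x: 0, y: 0, t: 0 };
    let empty_l = Lev {
        ts: [empty_t; TILES],
        rs: [empty_r; ROOMS],
    };
    [empty_l; LEVELS]
}

fn main() {
    let mut ls = empty_levels();
    ls[0].rs[0].w = 5;
    println!("{} levels, first room width {}", ls.len(), ls[0].rs[0].w);
}
```

A zeroed instance is still written out by hand in this sketch, so the verbosity described above is reduced rather than removed.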
Go was impressive, performing well for a relatively new language albeit still falling behind D. The default PRNG was quite lacking in speed, so changing it to use XorShift resulted in a massive speedup. Interestingly, while GCCGo was faster with the default PRNG, the standard 6g runtime was faster with the bitshifted XorShift PRNG. The freedom from semicolons is a nice experience, and in terms of my subjective experience it was the most enjoyable language to write imperative code in. I’m quite looking forward to the development of LLVM Go.
Although it took me quite a while to optimise, I’m happy with Haskell’s performance, especially since it has to carry a vector of [initially] ten million random Ints around (random numbers can’t just be generated in the middle of functions, as that breaks purity). Writing recursively was a nice change, and the naive Haskell version was the most concise implementation of the languages tested (it grew significantly when I switched from lists to vectors and to the faster random number generator, as I couldn’t get pattern matching to work on vectors so had to stick [unsafe]Head and Tail everywhere).
**Updated the Haskell version to use MWC generator instead of Mersenne, and updated again to use more idiomatic code, resulting in a speedup to 0.770s. Updated yet again to use XorShift, now got the running time down to 0.546s. Running with -fllvm speeds it up to 0.428s.
Any more improvements to the Haskell code would be most welcome. Stay tuned for parallel level generation benchmarks! (Eventually..)
Finally, the moral of the story: there’s no such thing as a slow language, only an insufficiently optimising compiler.
Second moral of the story: if cryptographic-level randomness isn’t needed, then XorShift is the best PRNG algorithm available in terms of speed.