2 Answers
C++ is a general-purpose language, whereas R is a statistical (and graphical) language. A (good?) analogy would be the difference between a pickup truck and a forklift: they both have wheels and can carry things, but they have very different purposes. You could use your forklift to transport a load five miles down the road, but why would you? You have a pickup truck for that. You can also do a lot more with the pickup truck but, importantly, it cannot lift things.
C++ is a building with the outside finished but not the inside: a bunch of wire of different gauges, a bunch of junction boxes, a bunch of outlets for both 120 and 240 volts, and a bunch of breakers and switches. R is an already wired building with outlets you can plug stuff into. The finished building is very useful for whatever purpose the builder wired it for, as R is for statistics on datasets that aren't too big, but it can't easily be rewired to do many different things the way the unfinished building could be.
C++ (programming language): Why do people who come from a physics background tend to use C++?
13 Answers
I would wager that nearly every single physicist with a PhD has familiarity with at least one slow but easy-to-use language, such as Python/NumPy or MATLAB. Those who need to run big codes on large machines (which is many, many people) must know C/C++ and sometimes need to know Fortran or even CUDA/OpenCL. It is hard to overemphasize the importance of computation to modern science, and C/C++/Fortran are the engines that drive the biggest simulations.
Akshat Mahajan,
C++ is optimizable. It's close enough to the hardware to allow for remarkably good optimization, and its explicit memory management means you can get some serious speedups. It's not as low-level as C while not being so high-level that it loses track of what it sets out to achieve.
C++ is popular. Many systems survive on legacy code, built well before Java and Python became all the rage. The first physicists who built those systems used C++ and taught it to their pupils, who in turn taught it to their pupils, and so on. When you're a physicist and you're first learning to code, you're much more likely to adopt a language your lab has ample support for.
C++ is everywhere. Many data collection systems just need something slightly higher than assembler. Physicists aren't just involved in coding - they're also busy making the instruments that eventually fly out to space or sit in underground labs. That means you need microprocessor systems, and you need to be able to code those microprocessor systems, and very often those microprocessor systems lend themselves to easy C++ use - Arduinos, for instance, use C++ as their default language. Why stay aloof when your experiment depends on something like this?
That being said, it's not as common as you're making it out to be. I myself know no C++, and most of the theoretical physicists I know are far more comfortable using something like Mathematica, Sage or Python. Experimental physicists tend to be all over the spectrum, using C++ and anything else at hand to get their systems up and running, while those who work with large datasets eventually end up using MATLAB, IDL or C++. A handful of physicists know Java, particularly if they're doing outreach or making mobile applications (really, you'd be amazed how useful a push notification can be if you've got sensitive lab equipment running). Using shell and Linux processes is not uncommon. Particle physicists, in particular, are notorious for using ROOT, a C++-based data analysis and plotting framework that is mostly run on Linux and is a pain to use if you don't know what you're doing.
Physicists, like software engineers, are all over the coding and software spectrum. Some are really good at implementing multithreaded processes; others at numerical, high-speed algorithms; others at impressive visualisations; others at collecting data. Sampling bias is probably at fault in giving you this mistaken impression of the C++-only physicist.
A big reason why C++ and FORTRAN are still used is legacy software. Most of the programs that have been designed for research have their origins in the 1980s/1990s, and these two were the best languages for developing any program back then. At that time, managing a program's memory usage was a real concern, and PCs simply weren't powerful enough to bulldoze through the calculations.
At least when working on biophysics-related calculations, a lot of the data output is simply too large to deal with using higher-level languages like Python or MATLAB.
One thing you'll notice with a lot of physics programs is that they still use a lot of C conventions: ** and & notation, malloc rather than new, and printf rather than cout. I remember personally using C++ as a bolt-on to what were originally C programs, simply because I found a lot of the C++ terminology turgid and inconvenient, contrary to its original purpose.
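To make the difference concrete, here is a minimal sketch of the same small task (fill a buffer with squares and sum it) written once with the C conventions mentioned above and once with their C++ counterparts. The function names `sum_squares_c_style` and `sum_squares_cpp_style` are made up for illustration, and the C++ version uses `std::vector` rather than a raw `new[]`, which is an assumption about what "modern" style would look like, not something taken from any particular physics code.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <numeric>
#include <vector>

// C conventions: malloc/free for the buffer, printf for output.
double sum_squares_c_style(std::size_t n) {
    double* buf = static_cast<double*>(std::malloc(n * sizeof(double)));
    if (buf == nullptr) {
        return 0.0;  // allocation failed (or n == 0 on some platforms)
    }
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<double>(i) * static_cast<double>(i);
        total += buf[i];
    }
    std::printf("C style:   %.1f\n", total);
    std::free(buf);  // whoever calls malloc must remember to call free
    return total;
}

// C++ conventions: std::vector owns the memory, std::cout for output.
double sum_squares_cpp_style(std::size_t n) {
    std::vector<double> buf(n);
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<double>(i) * static_cast<double>(i);
    }
    const double total = std::accumulate(buf.begin(), buf.end(), 0.0);
    std::cout << "C++ style: " << total << '\n';
    return total;  // buf releases its memory automatically here
}
```

Both functions compute the same number; the practical difference is who is responsible for releasing the buffer.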
Whilst I'm not experienced in this (I wish I were!), C/C++ is also closely linked to systems programming and Unix in general, and this tips the balance towards it for physicists.
Having said that, I do like Python and R too. Their libraries make things an order of magnitude easier. But for any hardcore number crunching, the C family is still the most reliable one, along with FORTRAN.
At first, I was very, very skeptical about MATLAB. The main reason is that I mistrust black-box algorithms. With C++ I can tell exactly which operations led to the results I'm trying to publish. Eventually I started to use MATLAB, and it works really well for 95% of the problems. Since it is closely tied to double-precision math, it cannot solve problems that require extended-precision (long double) types. I encounter a couple of these problems a year. In C++ you can also hack your numeric algorithms to use specific information about your problem. This, combined with the native nature of the language and the availability of very good compilers (the Intel compiler is particularly good), makes some C++ numeric algorithms run about 10x faster than their MATLAB counterparts. This is critical for algorithms that take a week to run in C++.
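As a rough illustration of the extended-precision point, the sketch below checks whether a very small increment survives being added to 1 in a given floating-point type. The helper name `increment_is_lost` is hypothetical. How much extra precision `long double` gives is platform dependent: with GCC or Clang on x86 Linux it is typically an 80-bit type with a 64-bit mantissa, while on some compilers it is the same as `double`, so the code reports the mantissa widths rather than assuming them.

```cpp
#include <cmath>
#include <limits>

// Returns true if adding 2^-53 to 1 leaves the value unchanged in type T,
// i.e. the increment is below what T can resolve near 1.
template <typename T>
bool increment_is_lost() {
    const T one = 1;
    const T tiny = std::ldexp(T(1), -53);  // 2^-53, half of an IEEE double's epsilon
    volatile T sum = one + tiny;           // volatile: force the result to be stored as T
    return sum == one;
}

// Number of mantissa bits the platform provides for each type.
constexpr int double_bits = std::numeric_limits<double>::digits;
constexpr int long_double_bits = std::numeric_limits<long double>::digits;
```

On a platform where `long_double_bits` is larger than 53, `increment_is_lost<double>()` is true while `increment_is_lost<long double>()` is false, which is the kind of difference that matters for the handful of problems described above.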
C++ is a handy tool. Once you've developed some libraries (such as wrappers for GNUPLOT and to communicate with instruments through GPIB, serial and USBTMC), you can hack out a code for an experiment literally in the time it takes for LabView to start. Furthermore, proprietary software has licensing issues (you cannot just install copies into all of your computers, so usually a machine ends up tied to an experiment), it's restricted to certain platforms, and likes to close without saving changes when the network connection fails and it loses communication with the licensing server. This is terrible for long experiments.
So for these reasons I use C++ quite often and I support continuing to teach it to physics graduates.
If you look at old-school physicists (think 1970s), they all used FORTRAN.
I think it's that C++ gives you a feeling of being close to the machine, and dealing with bits and bytes directly. Physics types like machines, of course. It's more technically demanding than many other languages, which for a person who likes physics isn't a problem. If you're very technically-minded, other languages can somehow feel too "soft" and like they're pandering to you at the expense of efficiency.
If you go a step further and work with assembly language, what you're doing feels almost similar to some of the mathematics you'll probably have encountered in a physics degree, e.g. in solid state physics.
I don't think it's necessarily relevant to the discipline though, except that you can process a lot of data quickly with it.
To quote George Hagstrom: "nearly every single physicist with a PhD has familiarity with at least one slow but easy-to-use language, such as Python/NumPy or MATLAB. Those who need to run big codes on large machines (which is many, many people) must know C/C++ and sometimes need to know Fortran or even CUDA/OpenCL."
With Julia you can have it all in a single language. Professor Alan Edelman of MIT explains in under 4 minutes why Julia is great, for physics and math for instance (but I would argue for most things).
[Note: he's listed as one of the designers. I guess he was the PhD supervisor of Jeff Bezanson, another of the five original designers (contributors now number in the hundreds?), or should I say the main designer/developer. Bezanson's thesis is on the Julia language and why it's great, and is a good read.]
1) It's general purpose. Scientific computing implies not only number crunching, but user interface, pre-processing, and post-processing tasks. You have tons of libraries available for C++ (and C) to do these tasks (numerical and others), thus your program can be linked easily.
2) Faster. Compiled C++ code (C and Fortran, too) is closer to what the machine understands (e.g. types such as int and double must be declared) and thus can be optimized by the compiler. MATLAB and many Python libraries (SciPy, NumPy) wrap around C/C++/Fortran libraries (BLAS, LAPACK). Add parallelization to that: MPI and OpenMP officially support only C/C++ and Fortran programs.
3) Production quality software. With the latest C++11 standard and powerful template libraries (STL), I found writing quality code in C++ much easier. STL containers (vector, map, and the hash table known as unordered_map), iterators, I/O streams and string handling, algorithms, smart pointers, try/catch, etc. help guard against user-introduced dangers such as memory leaks, segmentation faults, unhandled exceptions, and manual dynamic reallocations. Fortran is great for number crunching, but lacks many of these C++ standard features (many of which were available in Boost before C++11).
Thus, you don't have to reinvent the wheel as in Fortran.
4) Maintainability: C++ objects, lambdas and templates make code easier to maintain. Objects add modularity to the code, whereas using lambdas and templates can reduce the number of lines of code dramatically.
5) MATLAB and Python are easy to learn and could possibly satisfy your needs. But for many applications involving number crunching you need something more powerful (mostly C/C++/Fortran). As I once heard, Python/MATLAB are like a bicycle (easy to learn, takes you from point A to point B with little learning curve), whereas C++ is like a car (steeper learning curve, but it takes you farther and is faster). I think Fortran is the powerful old vintage car that works and that nobody wants to touch, only maintain, as it would be more expensive to redesign it (if it ain't broke, don't fix it).
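Points 3 and 4 above can be sketched in a few lines. This is only an illustration under assumed names: `mean_of`, `count_above`, `tally`, `Run` and `make_run` are hypothetical helpers, not part of any library. It sticks to C++11 features, so the smart pointer is built with `new` inside a `std::unique_ptr` rather than with `std::make_unique`, which arrived in C++14.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <numeric>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Template: one definition works for any container of arithmetic values.
template <typename Container>
double mean_of(const Container& values) {
    if (values.empty()) return 0.0;
    const double total = std::accumulate(values.begin(), values.end(), 0.0);
    return total / static_cast<double>(values.size());
}

// Lambda: the selection rule is written inline, where it is used.
std::size_t count_above(const std::vector<double>& values, double threshold) {
    return static_cast<std::size_t>(std::count_if(
        values.begin(), values.end(),
        [threshold](double v) { return v > threshold; }));
}

// Hash table (std::unordered_map): tally how often each label occurs.
std::unordered_map<std::string, int> tally(const std::vector<std::string>& labels) {
    std::unordered_map<std::string, int> counts;
    for (const auto& label : labels) {
        ++counts[label];
    }
    return counts;
}

// Hypothetical record for one measurement run.
struct Run {
    std::string name;
    std::vector<double> samples;
};

// Smart pointer: the Run is destroyed automatically when the last
// owner (here a unique_ptr) goes out of scope, so there is no delete to forget.
std::unique_ptr<Run> make_run(const std::string& name, std::vector<double> samples) {
    return std::unique_ptr<Run>(new Run{name, std::move(samples)});
}
```

A caller would write something like `mean_of(run->samples)` or `count_above(run->samples, 2.5)` without ever touching raw memory, which is the safety and brevity argument made in the list.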
I think that kind of task appeals to physics types, and they do relatively little human interface type programming or data mining, for which C++ is less favoured.
For example, take adding two 4x4 matrices a and b. You can create a matrix class, overload the + operator, and simply write "a + b", whereas in languages without operator overloading you would probably have a function that does the adding by taking in the two objects.
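A minimal sketch of such a class might look like the following. The class name `Matrix4` and its row-major storage layout are assumptions made for the example, not a reference to any existing library.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size 4x4 matrix of doubles, stored row by row.
class Matrix4 {
public:
    Matrix4() : data_{} {}  // zero-initialise all 16 entries

    // Element access as m(row, col), with row and col in 0..3.
    double& operator()(std::size_t row, std::size_t col) {
        return data_[row * 4 + col];
    }
    double operator()(std::size_t row, std::size_t col) const {
        return data_[row * 4 + col];
    }

    // Element-wise sum, so callers can simply write `a + b`.
    friend Matrix4 operator+(const Matrix4& a, const Matrix4& b) {
        Matrix4 result;
        for (std::size_t i = 0; i < 16; ++i) {
            result.data_[i] = a.data_[i] + b.data_[i];
        }
        return result;
    }

private:
    std::array<double, 16> data_;
};
```

With this in place, `Matrix4 c = a + b;` reads like the mathematics it represents, instead of a call such as `add(a, b)`.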