How is science used in software development

Software development in science


1 Software Development in Science: An Introduction
Sandra Schröder
Department of Computer Science, Scientific Computing working group, University of Hamburg
Seminar: Software Development in Science

2 Outline
1. Introduction and Motivation
2. Scientific Software Development
   What is Scientific Software Development?
   Requirements
   Software Design
3. Hardware
4. Computer Arithmetic
   Machine Numbers
   From the place value system to floating point representation
   Rounding and rounding errors
5. Numerics
   Error assessment mechanisms
6. Summary
7. Discussion and Outlook

3 Introduction and Motivation: Initial Situation
Starting point: a scientific question
The question is modeled mathematically with the help of equations
We are interested in the (exact) solution of these equations
The computer supports the search for this solution
Interplay between mathematics, computer science and the natural sciences (MIN)

4 Introduction and Motivation: MIN (Mathematics, Computer Science, Natural Sciences)
Definition of MIN
Figure: Interaction
Mathematics: effective structuring and logical justification of knowledge
Computer science: representation, storage, transmission and processing of information
Natural science: exploration of animate and inanimate nature through description, recording and explanation
(wikipedia.de)

5 Introduction and Motivation: MIN (Mathematics, Computer Science, Natural Sciences)
Modeling scheme
Figure: Sampling

6 Scientific Software: Overview
Scientific software: what, why, how?
Requirements for scientific software
Software design: planning, writing, testing, debugging, tuning, documenting

7 Scientific Software: The Dilemma of Scientific Software Development
Non-scientific software is comparatively easy to implement
An average programmer would not understand the scientific relationships involved
This calls for cooperation between programmers and scientists
Sounds simple, but ...

8 Scientific Software: Communication
Figure: Communication [Source1]

9 Scientific Software: How Scientists Develop Their Own Software
Figure: Software development in a scientific way [Source1]

10 Scientific Software: What is Scientific Software?
Scientific vs. non-scientific
Scientific: solving complicated equations; support for data evaluation and for measurement and control engineering; simulation
Non-scientific: account management, hotel booking, cinema management

11 Scientific Software: Requirements for Scientific Software
Reliability
   Correctness
   Robustness
   Accuracy
Portability
Efficiency
   Runtime efficiency
   Memory efficiency

12 Scientific Software: Requirements for Scientific Software
What is reliability?
Reliability: the probability with which the program fulfills its function
Correctness: a program is correct if it generates correct output data from the input data
Robustness: the degree to which a program detects errors and reacts to them in a way the user can understand, while still maintaining its functionality
Accuracy: the agreement between the displayed value and the exact value

13 Scientific Software: Requirements for Scientific Software
Portability
Hardware develops rapidly
Hardware independence
Software independence
Aim for high portability

14 Scientific Software: Requirements for Scientific Software
Efficiency
Two aspects of efficiency are of interest:
   Runtime efficiency
   Memory efficiency
Efficiency can be achieved at different design levels:
1. Algorithms and data structures
2. Code optimization
3. System software
4. Hardware environment

15 Scientific Software: Requirements for Scientific Software
Efficiency and reliability
Efficient but unreliable software is worthless!
Unreliable software leads to higher costs
Individual unreliable program parts can greatly increase the development effort
Efficiency is necessary!

16 Scientific Software: Software Design
Software design: the general procedure
1. Planning: creating a structure, selecting libraries, choosing the programming language(s), ensuring portability
2. Writing
3. Debugging: GNU Debugger and Valgrind; for hardware: Joint Test Action Group (JTAG)
4. Testing
5. Profiling
6. Documentation

17 Scientific Software: Software Design
Profiling
Analyze the runtime behavior of a program
Performance tuning: speed, memory utilization, concurrency (modern)
There are various analysis techniques, including statistical evaluation and instrumentation
Example: gprof
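
The slide names gprof, which targets compiled programs. As an analogous sketch only, the following uses Python's standard-library profiler cProfile instead (a swapped-in tool, not the one from the talk). The function slow_sum_of_squares and the helper profile_call are hypothetical names chosen for illustration.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop so the profiler has something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report lists call counts and times per function, which is the kind of runtime information a profiler such as gprof provides for compiled code.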

18 Scientific Software: Software Design
Last but not least: the documentation
Comments
Test documentation
Documentation of the development: task definition, solution of the task, flowchart and process model
Documentation of the program functions for the user: use of the software product, possible error messages and how to handle them

19 Scientific Software: Software Design
Examples of libraries
Built-in C/C++ libraries: stdio.h, math.h, stdlib.h, time.h
GNU Scientific Library: complex number arithmetic, vector and matrix computation, interpolation, differentiation, integration, fast Fourier transform
Other libraries: Numerical Recipes

20 Scientific Software: Software Design
Programming languages
High-level languages: C, Fortran, C++
Scripting languages: Perl, Python, shell, R
Symbolic languages: Matlab, Maple, Mathematica
Graphical languages: LabVIEW

21 Scientific Software: Software Design
Compiler
A compiler runs through two phases:
1. Analysis phase: scanning, syntax analysis, semantic analysis
2. Synthesis phase: intermediate code generation, program optimization, code generation
Example: GNU Compiler Collection

22 Hardware: Overview
Efficiency at the hardware level
Memory hierarchy and locality

23 Hardware: Efficiency at the Hardware Level
Efficiency is one of the most important requirements for scientific software!
How do you achieve efficiency at the hardware level?
Of particular interest: memory efficiency

24 Hardware: Memory Hierarchy
Figure: Memory hierarchy

25 Computer Arithmetic: Overview
Machine numbers
Excursus: place value systems
Floating point numbers
IEEE standard
Rounding and rounding errors

26 Computer Arithmetic: Machine Numbers
Motivation:
Real numbers cannot, in general, be represented exactly in the computer
Scientific calculations nevertheless depend on real numbers
Real numbers are therefore represented as machine numbers
Definition (machine numbers): A finite set $M \subset \mathbb{R}$ is called the set of machine numbers.

27 Computer Arithmetic: A Short Digression (How Did That Work Again?)
Basis
Given: a set of b symbols, $B = \{0, 1, \ldots, b-1\}$
b is the base (radix)
Bijective assignment of the digits
Goal: use the symbols to represent numbers of any size
The value of a digit is determined by its position in the number
This yields the place value system

28 Computer Arithmetic: A Short Digression (How Did That Work Again?)
The b-adic expansion
Definition (natural numbers): $S = \sum_{i=0}^{n} a_i b^i = a_0 + a_1 b + \ldots + a_n b^n$
Definition (integers): $S = \pm \sum_{i=0}^{n} a_i b^i = \pm (a_0 + a_1 b + \ldots + a_n b^n)$
Definition (rational numbers): $S = \pm \left( \sum_{i=0}^{n} a_i b^i + \sum_{i=-\infty}^{-1} a_i b^i \right)$
Here b denotes the base of the number system and each $a_i$ is a digit from the symbol set B.
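
The b-adic expansion of a natural number can be computed by repeated division with remainder. The following is a minimal sketch; the function names to_digits and from_digits are hypothetical and chosen only for illustration.

```python
def to_digits(n, b):
    """Return the digits a_0, a_1, ..., a_n of n in base b, least significant first."""
    if b < 2:
        raise ValueError("base must be at least 2")
    if n < 0:
        raise ValueError("n must be a natural number")
    digits = []
    while True:
        n, a = divmod(n, b)  # a is the next digit, n the remaining quotient
        digits.append(a)
        if n == 0:
            return digits


def from_digits(digits, b):
    """Evaluate the sum of a_i * b**i for digits given least significant first."""
    return sum(a * b**i for i, a in enumerate(digits))


print(to_digits(13, 2))               # [1, 0, 1, 1], i.e. 13 = 1 + 0*2 + 1*4 + 1*8
print(from_digits([1, 0, 1, 1], 2))   # 13
```

Storing the digits least significant first keeps the list index equal to the exponent i in the sum.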

29 Computer Arithmetic: Floating Point Numbers
And what about real numbers?
Representation with: sign, base, mantissa, exponent
Idea: represent real numbers numerically as floating point numbers
First approach to a representation: $Z = \text{sign} \cdot \text{mantissa} \cdot \text{base}^{\text{exponent}}$

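30 Computer Arithmetic: Floating Point Numbers
Uniqueness
The representation is not unique: the same value can be written with different pairs of mantissa and exponent
Normalization of the number:
Definition (normalization): A number in the representation $Z = \text{sign} \cdot \text{mantissa} \cdot \text{base}^{\text{exponent}}$ is normalized if $1 \le \text{mantissa} < \text{base}$.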

31 Computer Arithmetic: Floating Point Numbers
The IEEE standard
Computers use the binary system: base b = 2
Single precision: 32 bits
Double precision: 64 bits
IEEE 754 is the best-known floating point system

Table: IEEE standard
          Mantissa   Exponent
Single    23 bits    8 bits
Double    52 bits    11 bits

The mantissa is normalized (hidden bit)
The exponent is stored in excess (bias) representation
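
The field widths in the table can be made visible by splitting a double into its bit fields. This is a minimal sketch using Python's standard struct module; the helper names decompose_double and recompose_normal are hypothetical, and the bias of 1023 is the value IEEE 754 specifies for double precision.

```python
import struct


def decompose_double(x):
    """Split a double into (sign bit, biased 11-bit exponent, 52-bit fraction)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, biased_exponent, fraction


def recompose_normal(sign, biased_exponent, fraction):
    """Rebuild the value of a normal number; the leading 1 is the hidden bit."""
    if not 0 < biased_exponent < 0x7FF:
        raise ValueError("only normal numbers are handled in this sketch")
    return (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** (biased_exponent - 1023)


print(decompose_double(1.0))    # (0, 1023, 0)
print(decompose_double(-2.0))   # (1, 1024, 0)
```

Zero, subnormal numbers, infinities and NaN use the reserved exponent values 0 and 2047 and are deliberately left out here.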

32 Computer Arithmetic: Conclusion
Extension of the definition of machine numbers:
Definition (machine numbers II): A real number x is represented in the computer as a floating point number $\mathrm{float}(x) \in M$ of the form
$\mathrm{float}(x) = (-1)^s \cdot (0.a_1 a_2 \ldots a_n) \cdot 2^e, \quad a_1 \neq 0, \quad s \in \{0, 1\}$
The set can be described as a 4-tuple $M(2, n, \mathrm{MIN}, \mathrm{MAX})$, where n is the number of mantissa digits and MIN and MAX denote the minimum and maximum number that can be represented.
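
The form $(-1)^s \cdot (0.a_1 a_2 \ldots a_n) \cdot 2^e$ with $a_1 \neq 0$ corresponds to a mantissa in the interval [0.5, 1). Python's math.frexp returns exactly this kind of decomposition, so it can serve as a small sketch of the definition. The helper names normalized_form and mantissa_bits are hypothetical.

```python
import math


def normalized_form(x):
    """Return (s, mantissa, e) with x == (-1)**s * mantissa * 2**e and 0.5 <= mantissa < 1."""
    if x == 0.0 or not math.isfinite(x):
        raise ValueError("only finite non-zero numbers have a normalized form")
    m, e = math.frexp(x)
    s = 0 if m > 0 else 1
    return s, abs(m), e


def mantissa_bits(mantissa, n):
    """Return the first n binary digits a_1, ..., a_n of a mantissa in [0.5, 1)."""
    bits = []
    for _ in range(n):
        mantissa *= 2
        bit = int(mantissa)
        bits.append(bit)
        mantissa -= bit
    return bits


print(normalized_form(6.0))      # (0, 0.75, 3), since 6 = 0.75 * 2**3
print(mantissa_bits(0.75, 4))    # [1, 1, 0, 0], i.e. 0.1100 in binary
```

Because the mantissa is at least 0.5, the first binary digit a_1 is always 1, which is what allows IEEE 754 to store it implicitly as the hidden bit.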

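33 Computer Arithmetic: Rounding
Definition (rounding): A rounding is a mapping $\mathrm{rd}: \mathbb{R} \to M$ with $M \subset \mathbb{R}$ that satisfies the following conditions:
$\mathrm{rd}(m) = m$ for all $m \in M$, and
$r \le s \Rightarrow \mathrm{rd}(r) \le \mathrm{rd}(s)$ for all $r, s \in \mathbb{R}$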

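34 Computer Arithmetic: Rounding
Rounding errors
Rounding leads to rounding errors
Absolute error: $|\mathrm{rd}(x) - x|$
Relative error: $\frac{|\mathrm{rd}(x) - x|}{|x|}$
Estimate for the relative error (without proof): $\frac{|\mathrm{rd}(x) - x|}{|x|} \le \sigma := \frac{1}{2} b^{1-t}$, where t is the number of mantissa digits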
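
The bound on the relative error can be checked numerically. The sketch below takes IEEE 754 single precision as the set of machine numbers, so it assumes b = 2 and t = 24 (23 stored mantissa bits plus the hidden bit), which gives $\sigma = 2^{-24}$. The names rd_single, relative_rounding_error and SIGMA_SINGLE are hypothetical, and the check only applies to values in the normal range of single precision.

```python
import struct
from fractions import Fraction


def rd_single(x):
    """Round a Python float (a double) to the nearest single precision number."""
    return struct.unpack("f", struct.pack("f", x))[0]


def relative_rounding_error(x):
    """Exact relative error |rd(x) - x| / |x|, computed with rational arithmetic."""
    exact = Fraction(x)
    rounded = Fraction(rd_single(x))
    return abs(rounded - exact) / abs(exact)


# sigma = 1/2 * b**(1 - t) with b = 2 and t = 24 (assumption: single precision)
SIGMA_SINGLE = Fraction(1, 2) * Fraction(2) ** (1 - 24)

for x in (0.1, 1 / 3, 3.14159, 123456.789, 1e10):
    error = relative_rounding_error(x)
    print(x, float(error), error <= SIGMA_SINGLE)
```

Fraction converts a float to its exact rational value, so the comparison with sigma is not itself affected by rounding.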

35 Computer Arithmetic: Rounding
Machine epsilon
Definition: The smallest positive number c for which $\mathrm{rd}(1.0 + c) > 1.0$ is called the machine epsilon $\varepsilon_{\text{machine}}$; it is a measure of the computational accuracy.
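
A common way to estimate the machine epsilon is to halve a candidate until adding it to 1.0 no longer changes the result. The sketch below only tests powers of two, so it is an approximation of the definition: in double precision it returns $2^{-52}$, the distance from 1.0 to the next larger double, which is also what Python reports as sys.float_info.epsilon. The function name machine_epsilon is hypothetical.

```python
import sys


def machine_epsilon():
    """Smallest power of two eps with 1.0 + eps > 1.0 in double precision."""
    eps = 1.0
    while 1.0 + eps / 2 > 1.0:
        eps /= 2
    return eps


eps = machine_epsilon()
print(eps)                             # 2.220446049250313e-16
print(eps == sys.float_info.epsilon)   # True
```

With round-to-nearest, numbers slightly larger than half of this value already satisfy the inequality in the definition, so the exact smallest c depends on the rounding mode.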

36 Numerics: Overview
What is numerics?
Error assessment mechanisms: condition, stability, consistency

37 Numerics: What is Numerics?
A subfield of mathematics
Approximation
Error estimation and error minimization
Algorithms, their evaluation and optimization
Computation with the help of the computer
Figure: Numerical integration with Simpson's rule [Source2]
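
The figure on this slide shows numerical integration with Simpson's rule. A minimal sketch of the composite Simpson rule follows; the function name simpson and the choice of test integrands are assumptions made for illustration.

```python
import math


def simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] using n subintervals (n even)."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


print(simpson(math.sin, 0.0, math.pi, 100))   # close to the exact value 2
print(simpson(lambda x: x**3, 0.0, 2.0, 2))   # 4.0, Simpson is exact for cubics
```

The error of the composite rule shrinks with the fourth power of the step size h for sufficiently smooth integrands.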

38 Numerics: Condition, Stability, Consistency
Error assessment
Basic mechanisms of error assessment:
Condition of a problem
Stability of an algorithm
Consistency of an algorithm

39 Numerics: Condition, Stability, Consistency
Condition of a problem
How do disturbances in the input data affect the result, regardless of the selected algorithm?
Given: $f: D \to W$ with $x \in D$
Perturbed x: $\tilde{x}$
Perturbed f(x): $f(\tilde{x})$
Effect of the data error: $f(\tilde{x}) - f(x)$
Condition = sensitivity of the solution
A problem can be well-conditioned or ill-conditioned
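
The sensitivity described here can be estimated by perturbing the input slightly and comparing relative changes. The following is a rough sketch, not a rigorous condition number computation; the function name amplification, the perturbation size 1e-8 and the two example problems are assumptions.

```python
import math


def amplification(f, x, rel_perturbation=1e-8):
    """Ratio of the relative change in f(x) to the relative change in x."""
    x_tilde = x * (1 + rel_perturbation)
    rel_in = abs(x_tilde - x) / abs(x)
    rel_out = abs(f(x_tilde) - f(x)) / abs(f(x))
    return rel_out / rel_in


# Well-conditioned: the square root damps relative input errors (factor about 0.5).
print(amplification(math.sqrt, 2.0))

# Ill-conditioned: subtracting nearly equal numbers amplifies them (factor about 1000).
print(amplification(lambda x: x - 1.0, 1.001))
```

Both results describe the problem itself: no choice of algorithm can make the subtraction near x = 1 insensitive to errors in x.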

40 Numerics: Condition, Stability, Consistency
Stability of an algorithm
Stability: how robust is an algorithm against (small) disturbances in the input data?
Estimating the error $\tilde{f}(x) - f(x)$:
   Forward analysis
   Backward analysis
Condition and stability are interrelated! This can be shown mathematically.
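
Stability is a property of the algorithm, so two algorithms for the same well-conditioned problem can behave very differently. As an illustrative sketch (the example problem is an assumption, not taken from the talk), consider evaluating 1 - cos(x) for small x: the direct formula suffers from cancellation, while the mathematically equivalent form 2 sin²(x/2) does not.

```python
import math


def one_minus_cos_naive(x):
    """Direct evaluation; unstable for small x because cos(x) rounds to about 1."""
    return 1.0 - math.cos(x)


def one_minus_cos_stable(x):
    """Equivalent half-angle form 2*sin(x/2)**2; avoids the subtraction."""
    s = math.sin(x / 2)
    return 2.0 * s * s


x = 1e-8
reference = x * x / 2 - x**4 / 24   # leading terms of the Taylor series
for name, value in (("naive", one_minus_cos_naive(x)),
                    ("stable", one_minus_cos_stable(x))):
    print(name, value, abs(value - reference) / reference)
```

The naive version loses essentially all significant digits at this x, while the stable version stays accurate to around machine precision.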

41 Numerics: Condition, Stability, Consistency
Consistency of an algorithm
Consistency: does the algorithm actually solve the given problem, or a different one?
Given: a continuous problem and its exact solution, a numerical solution, and a step size
A method is called consistent if its error can always be bounded in terms of the selected step size.
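
One common reading of this definition is that the local error per step goes to zero as the step size h goes to zero. The sketch below assumes that reading and uses the explicit Euler method on the test problem y' = -y, y(0) = 1, whose exact solution is exp(-t). Both the method and the test problem are illustrative choices, and the names euler_step and consistency_error are hypothetical.

```python
import math


def euler_step(f, t, y, h):
    """One step of the explicit Euler method for y' = f(t, y)."""
    return y + h * f(t, y)


def consistency_error(h):
    """Local error per unit step of explicit Euler for y' = -y, y(0) = 1."""
    exact = math.exp(-h)
    numerical = euler_step(lambda t, y: -y, 0.0, 1.0, h)
    return abs(exact - numerical) / h


for h in (0.1, 0.01, 0.001):
    # For this problem the error is bounded by h / 2.
    print(h, consistency_error(h), h / 2)
```

The printed errors shrink in proportion to h, which corresponds to consistency of order one for the explicit Euler method.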

42 Summary
Computer science supports the sciences in structuring, processing and preparing large amounts of information.
Efficiency: understanding the hardware environment can help to make programs more efficient.
Real numbers are stored in the computer as floating point numbers.
Error assessment: condition, stability and consistency

43 Discussion, Conclusion and Outlook
Conclusion
Overall: a very interesting topic
Difficult: giving an overview of (almost) all topics without going into too much detail
Numerics is interesting, but difficult to understand in some areas, e.g. error theory
Outlook
The hardware part turned out to be rather short
Not covered: the hardware side of computer arithmetic. How is bit accuracy achieved? Possibly a seminar topic of its own?
How can collaboration between average programmers and scientists be improved? Is there a suitable process model?

44 Thank you for your attention!

45 Sources: Books
Suely Oliveira, David Stewart (2006): Writing Scientific Software: A Guide to Good Style. Cambridge University Press.
Thomas Huckle, Stefan Schneider (2002): Numerical Methods. Springer, 2nd edition.
Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman: Compilers: Principles, Techniques, and Tools. Pearson Studium, 2nd edition.

46 Sources (2): Internet Pages and Papers
Helmut G. Katzgraber: Scientific Software Engineering in a Nutshell. cache / arxiv / pdf / 0905 / v2.pdf
Number representations. Java / number representations.pdf
Numerical software
Judith Segal, Chris Morris: Developing Scientific Software

47 Sources (3): Image Sources
[Source1]: Judith Segal, Chris Morris: Developing Scientific Software
[Source2]: Simpson method