CORDICThe calculus courses provide us with tools to compute the values of trigonometric functions,for example,via series expansions, polynomial,and rational function approximations.However,these implementations tend to require multiplication and division operations that make them expensive in hardware.In contrast,CORDIC(COrdinate Rotation Digital Computer)algorithms need only adders,shifters and comparators for computing a wide range of elementary functions.The method is especially efficient whenfixed point implementations of signal processing algorithms on hardware are considered.For example,CORDIC is extremely popular in hardware accelerators and also in SIMD(Single-Instruction Multiple Data)realizations.Furthermore,almost all function calculators employ CORDIC. Intel processors used CORDIC for trigonometric functions till80486.CORDIC is a good choice for hardware solutions such as FPGA in which cost(gate count)minimization is more important than throughput maximization.In software implementations CORDIC enables most of the code and data be shared between routines for trigonometric and hyperbolic functions,helping to conserve memory.CORDIC algorithm is often used to implement rotations needed in modulators and demodulators.CORDIC algorithm was introduced in1959by Volder for implementing a real-time navigation computer for aeronautical appli-cations.The algorithm was initially formulated for computing the values of trigonometric functions.In early1970s the CORDIC techniques were extended to exponential,logarithm,forward and inverse circular and hyperbolic functions,ratios and square roots(Walther1971).Its concepts have also been developed to include calculation of the Discrete Fourier Transform(Despain 1974).More recently(Bajard et al1994)efficient hardware technique known as BKM for computing complex exponentials and trigonometric functions was proposed and has since been very widely applied.13.1CORDIC Fundamentals:Vector RotationIn this section,we consider computation of vector rotation illustrated in Fig.1,which maps vector(x,y)to(x′,y′)according to the equationsx′=x cosφ−y sinφ(1)y′=y cosφ+x sinφwhereφis a rotation angle.Note that four multiplications and two additions are needed to compute x′and y′provided that the values of cosφand sinφare available.1If this is not the case,we might consider computing them digitally by series expansion, which is obviously complex.Figure1:Rotation of vector(x,y)vector byφdegrees.3.1.1Concatenation of rotationsAs afirst step towards the CORDIC implementation,we note that ifφ=φa+φb,we mayfirst map(x,y)to(x′′,y′′)using the angleφa,and then map(x′′,y′′)to(x′,y′)using the angleφb.So,it is possible to concatenate mappings for angles φi,(i=0,...,N−1)in order to evaluate the mapping forφ= N−1i=0φi.In the following,we will denote with(x i,y i)and(x i+1,y i+1)the input and output to the rotation byφi:x i+1=x i cosφi−y i sinφi(2)y i+1=y i cosφi+x i sinφi3.1.2Applying arithmetic shift and additionTo proceed,let us assume that−π/2<φi<π/ing tanφ=sinφ/cosφ,equation(2)can be rewritten asx i+1=cosφi(x i−y i tanφi)(3)y i+1=cosφi(y i+x i tanφi)which suggests the computational structure shown in Fig.2.Note that cosφi=cos(−φi)and tanφi=−tan(−φi),so the mapping for a negative angle−φi is the same as forφi except the change of signs in the terms involving the tangent. Multiplication by a power of two corresponds to the arithmetic shift operation,which is cheap to implement.The main idea of the CORDIC algorithm is that multiplication by tanφi can be based on shifting,whentanφi=±2−i,i∈{0,1,2,...}.Under this condition,(3)becomesx i+1=cosφi(x i−d i·y i·2−i)y i+1=cosφi(y i+d i·x i·2−i)(4)3Figure2:Organizing computations of the transform.For some specific anglesφi,multiplication by tanφi can be replaced by an arithmetic shift and some sign manipulation.where d i=+1,ifφi>0,and d i=−1,ifφi<0.Thus,substituting d i=−1for d i=+1corresponds to swapping of signs of the second terms within parentheses,that is,subtraction becomes addition and vice versa.3.1.3Gain compensationHowever,(4)contains still multiplications by cosφi,and if several rotations were concatenated,we would have lots of multipli-cations.To solve the problem,wefirst notice thatφi=arctan(2−i),i∈{0,1,2,...}and√cosφi=cos(arctan(2−i))=1/√Then we divide both sides of(4)by cosφi=1/√1+2−2i.Now,the right hand sides contain just the the shift-and-addition parts of computation,and we get x i+1 and y i+1amplified by the gain a i.To see,how to compensate for the gain,let us multiply by a constant A i both sides in(5),and let A i+1=a i A i.As a result,we get recursive equationsx i+1·A i+1=x i A i−d i·y i A i·2−iy i+1·A i+1=y i A i+d i·x i A i·2−i.(6) These equations give the output of a chain of blocks,where just the shift-add parts of the rotations are computed,and multiplications by cosφi are neglected.After N such steps,we must multiply the results by1/A N to get x N and y N.The value of the gain A N can be calculated usingA N=N−1 i=0√Table1:The gain values tabulated till iteration9.Iteration i Iteration i0516273849component might not be needed as d i’s can be precomputed and tabulated.In the table below we have precalculated thefirst six entries of an arctan(2−i)look-up table.Table2:Precalculated thefirst six entries of an arctan(2−i).Iteration i arctan(2−i)[deg]0.7854126.570.245037.130.06245 1.793.2Using the CORDIC algorithmThe CORDIC algorithm is usually employed in two ways with the number of iterations isfixed in the beginning:1.rotate an input vector by given angle and output the resulting vector2.rotate the given input vector onto the x axis and output the require rotation angle73.2.1Example:Rotating an input vectorLet us initialize z i with the value of the desired rotation angle and then start the iterations of the algorithm using the following equationsx i+1·a i=x i−d i·y i·2−iy i+1·a i=y i+d i·x i·2−iz i+1=z i−d i·arctan(2−i)whered i= −1if z i<0+1otherwiseThe following is an example on rotating an input vector by30degreesFor the reasons of simplicity we initially calculate the resulting angle at four bits of precision(for some reason this expression is used in literature although the relationship between the angle and2−i is a non-linear one).Initially,we let z0=30and start iterating the angle calculations.Notice that this gives us the values of d(direction)needed in calculating the vector coordinates at each iteration.i d i z i+1[deg][deg]0+130.00−45.00=−15.00−15.0026.570.500/0.10002+111.57−14.04=−2.47−2.477.130.125/0.00104......We assume(x0,y0)=(1,0),and notice that z0=30is greater or equal to0,so d0=1,and getx1=x0−d0·y0·2−0=1y1=y0+d0·x0·2−0=1z1=z0−d0·arctan(2−0)=−15degreesNow z1=−15<0,so d1=−1,and the next iteration becomesx2=x1−d1·y1·2−1=1.5y2=y1+d1·x1·2−1=1.5z2=z1−d1·arctan(2−1)=11.57degreesFor convenience,in this example the outcomes of the calculations are presented decimal to avoid the additional understanding effort from binary-to-decimel conversions.In reality,everything should be done in binary arithmetics.As the angle z2is now positive,d2=1.Thenx3=x2−d2·y2·2−2=1.375y3=y2+d2·x2·2−2=0.875z3=z2−d2·arctan(2−2)=−2.47degreesThe angle z3is negative,so d3=−1.We getx4=x3−d3·y3·2−3=1.484y4=y3+d3·x3·2−3=0.703z4=z3−d3·arctan(2−3)=4.66degreesFinally,we perform(a rough)gain correction to obtain0.6073·x4=0.9010.6073·y4=0.4269We can repeat the above calculation in binary representation in which the multiplications by2−i become respective arithmetic shifts to the right.3.3Angle/radius calculation using rotatorThe rotator algorithm can be modified to compute the polar coordinates of the Cartesian coordinates2.The idea is to determine rotationφfor which y′=0in(1).Let us denote the unknown radius with r and the unknown angle withα.When y′=0,φmust be−α,and thereforex′=x cosφ−y sinφ=(r cosα)cos(−α)−(r sinα)sin(−α)=r,so after rotation x′will give the unknown radius(see Fig.4).3.3.1Determining rotation directionsLogic for determining the rotation direction d i must be based on checking the value of y i.Let us assume that the point(x i,y i) lies in the I or IV quadrant,that is,x i≥0.We note that if y i<0we should rotate counterclockwise,and if y i>0,we should rotate clockwise.Let z i correspond to the sum of rotations in clockwise direction,that is,it will converge toα.Then,we have the angle accumulation formulaz i+1=z i−d i·arctan(2−i)(10) with the decision rule(11)d i= −1if y i A i>0+1otherwise.Figure4:Rectangular to polar transform problem:determine the transform,which maps(x,y)to a point on x-axis.3.3.2Extending the range of x coordinateTo use the algorithm for negative x,we must perform an initial rotation which gives x0≥0.One approach is just to consider the sign of ly,if y≥0and we rotate by−π/2,then x0=−y sin(−π/2)=y≥0.On the other hand,if y≤0and we rotate byπ/2,then x0=−y≥0.d init= −1if y>0+1otherwise.(12) and the initial values arex0=−d init·y y0=d init·x z0=−d init·π/2(13)With this initialization,the result z N will converge toαin the range[−π,π].113.3.3The algorithmTo summarize,the rotator transform defined in(1)can be solved with the following steps:ing(12)and(13),compute the initial values of x0=x0A0,y0=y0A0and z0.2.Iterate N times(i=0,...,N−1):(a)Determine the rotation direction d i using(11).(b)Compute x i+1A i+1and y i+1A i+1using shift-add arithmetic based on(6).(c)Update the sum of angles,z i+1,using(10).pensate for the gain A N in the result to obtain x N,which corresponds to the radius coordinate r.z N gives the anglecoordinateα.Example2.Let(x,y)=(3,4).As y>0,d init=−1.First iterations of the algorithm are shown in Fig.5.As A5=1.6457, x5after gives quite accurately the radius r=i y i A i d i[deg][deg]0−3.00+1+7.00+45.0026.572−2.50+1+8.13+57.537.134+0.39−1+8.23+53.98Figure5:CORDIC iterations for mapping(3,4)to polar coordinates.3.4Implementation aspectsAn important notice:in particular the inverse trigonometric function calculations using the simpliest CORDIC algorithms may in some cases suffer from numerical precision problems.Consult the literature for techniques that avoid the problems.They are only slightly more complicated.The benefits of CORDIC algorithms are usually best achived in hardware and SIMD implementations.A rule of thumb in scalar software implementations is that if a hardware multiplier is available,it should be used.Then the trigonometric functions can be conveniently and efficiently computed via series expansions.Notice,however,that in many cases the CORDIC algorithms may converge faster!Many practical CORDIC implementations are based on bit serial binary arithmetics,because bit parallel operations and iterative implementations require a versatile arithmetic shifter(barrel shifter).A pipelined multistage realization may then be an attractive alternative as such an approach also provides a higher throughput.There are several possible optimizations available with CORDIC implementations.For instance,the rotation angles may be13known beforehand.This is the case when rotators are parts in DCT transform computations.Then we dont need to accumulate the angles,and we only need to know the sequence of directions d i.Furthermore,if the sign of the direction may get values-1,0,1,the number of iterations can be reduced.Figure6:A sketch of CORDIC hardware in which the iteration loops have been rolled open.143.4.1Historical noteVolder wasnt thefirst one to invent the CORDIC algorithm.Henry Briggs(1561-1630)in his book Arithmetica Logaritmica in 1624(!),described similar methods to calculate tables of logarithms(btw,invented by Neper,1550-1617).In his start of studies in1977Olli spent a fortune on a programmable HP calculator,and he was astonished to read from the manual that the algorithm for computing the logarithms etc.was based on a method devised by Briggs more than350year earlier.If you are interested you canfind a partial English translation of Arithmetica Logaritmica at /his-tory/Miscellaneous/Briggs/index.html3.5CORDIC example in binary formIn the following we present the earlier example in a binary form progressing through the solution step by step.For the purpose of demonstration we will use s9.5fixed point number format recognizing that this selection may not yield the most precise results.However,this enables us to see some of the difficulties in maintaining the precision.3.5.1ScalingStarting from the easiest element,the scale factor to compensate for CORDIC algorithm gain when the number of iterations is infinite is approximately0.6073.As this scale factor may be needed to correct for thefinal results of CORDIC viax=0.6073·x iy=0.6073·y i15we notice that it would be convenient to be able to implement the multiplication with arithmetic shifts and adds.We immediately see that0.6073·x≈0.5·x+0.125·x−0.015625·x=0.6093·xHere the error due to scaling is approximately0.33%if the number of iterations is infinite.That can be implemented as the following combination of shifts and adds(ashr(x,n)is arithmetic shift right of x by n bits and, ashl(x,n)is arithmetic shift left of x by n bits):0.6073·x≈ashr(x,1)+ashr(x,3)−ashr(x,6)Now,if x=1that is expressed as0001.00000in binary s9.5format,we get0.6073·x≈0000.10000+0000.00100+0000.00000=0000.10100=0.625The error is2.9%,10times the one with our decimal experiment!Another alternative that results in a better precision with s9.5format is0.6073·x≈ashr[(ashl(x,2)+x−ashr(x,3)),3]=ashr[0100.00000+0001.00000+1111.11100,3]=ashr[0100.11100,3]=0000.10011=0.59375The error is2.2%is now smaller,but still a sizable one.In practical implementations we simply connect the signal lines in a manner that corresponds to each shift.3.5.2Using the CORDIC iterations for vector rotationOur task is to rotate an input vector by30degrees.So,first z0=30=0.523rad=0000.10001.What are the errors in this case?16i d i z i+1rad/s9.5rad/s9.50+10000.10001−0000.11001=0111.110000000.011110.500/0.10002+10000.001000.125/0.00104......=0.25=−14.33degrees32We may recall that in our decimal arithmetics solution the result for the angle in thefirst iteration was−15degrees.However, there is nothing to worry,the error is just a characteristic of binary arithmetics.When designing a CORDIC implementation we need to pay careful attention to the precision of the solution.In theory,each iteration gives one more bit of precision,but the tabulated atan(2−i)values set the limit for the number of iterations.When the number of iterations increases,the more precision is needed for the atan(2−i)to determine the direction of rotations. At this point it is also good to remember that the location of the decimal point is only an agreement.As a reminder of the reality,we do not mark it in the binary words in the formulas.17Now z1=111111000<0,so d1=−1,and the next iteration becomesx2=x1−d1·y1·2−1=000100000+ahsr(000100000,1)=1.5y2=y1+d1·x1·2−1=000100000+ahsr(000010000,1)=0.5z2=z1−d1·arctan(2−1)=000000111==7。