Reconfigurable Ultrasonic Testing System Development Using Programmable Analog Front-End and Reconfigurable System-on-Chip Hardware
1. Introduction
Ultrasonic systems have been widely applied in industrial and medical fields. With ultrasound frequencies ranging from 1 to 20 MHz, data processing performance becomes critical to the system design. Major parameters defining such a complex system include signal processing, computational speed, data transfer time and storage. The capabilities of microprocessors and systems-on-chip (SoCs) have revolutionized the implementation of embedded systems, providing more independent features and abilities in a portable hand-held unit. In this study, an efficient architecture is designed and implemented to meet the requirements of high-performance ultrasonic signal processing applications. This reconfigurable ultrasonic testing system (RUTS) supports dynamic reconfiguration of the analog front-end (AFE), as well as real-time computation and data transfer, by using reconfigurable SoC analog and digital components. An experimental application of ultrasonic data acquisition and processing is realized on this system, with the AFE built upon reconfigurable hardware [1].
In a typical ultrasonic sensor system, the electrical excitation pulses generated by a high-voltage (HV) pulser are transformed into mechanical vibrations by an ultrasonic transducer. The resulting ultrasonic wave is transmitted into the target material and reflected back. This reflection, termed the “echo”, is digitized after passing through a receiver and is subsequently processed by a processing unit according to the algorithms required for imaging and evaluation of materials. Features such as the center frequency of operation, incident beam pattern, signal conditioning and data conversion must be tuned for optimal results based on the application and the target material being evaluated [2]. However, conventional hardwired systems do not provide the flexibility for such tuning. Therefore, in this study, components such as the beam controller, noise attenuator and amplifier are chosen so that they allow future upgradability. Furthermore, the ultrasonic data acquisition unit must be able to sample the received echo with minimum distortion within the operating frequency range of the ultrasonic measurement equipment. In this system, ultrasonic transducer probes are connected to and driven by the output of a high-voltage, high-speed pulse generator, the LM96551 [3], which supports an operating frequency of up to 15 MHz. The reflected echoes received by the system are then filtered and amplified to provide the raw signal that the ADC digitizes for further ultrasonic signal processing.
The system block diagram of RUTS, consisting of a front-end sensor interface with an associated conditioning block and a back-end processing/controller unit based on the Xilinx Zynq SoC, is presented in Figure 1. The front-end and back-end systems are connected via a serial peripheral interface (SPI). As shown in Figure 1, the RUTS architecture integrates four modules: (i) a TX-SDK-V2 ultrasound transmit/receive board from Texas Instruments (TI), which includes the LM96551 pulser [3], the LM96570 beamformer [4], and the LM96530 transmit/receive (T/R) switch [5]; (ii) a programmable low-noise filter/amplifier and programmable gain control amplifier, the VCA8500 [6]; (iii) an ADC unit, the AD9467 from Analog Devices [7]; and (iv) a Xilinx Zedboard, which integrates a Zynq SoC [8] that runs a custom Linux-based operating system and user application. The TX-SDK-V2 provides a customizable pulse pattern on all 8 channels. The VCA8500 receives ultrasound echoes from the TX-SDK-V2 and forwards them to the AD9467, which interfaces with the Zynq SoC through an LVDS interface. Furthermore, the Zynq SoC triggers ultrasonic excitation and processes the digitized echoes. The pulser/receiver sub-system can be broken down into two channels, transmit (Tx) and receive (Rx), as marked in Figure 1. The Tx channel consists of the programmable pulser and the digital beamformer. The Rx channel consists of the T/R switch, the low-noise preamplifier (LNP), the programmable gain amplifier (PGA) and the high-speed ADC.
The back-end processing sub-system employed in this study is the Xilinx Zynq SoC, which integrates two ARM Cortex A9 processors running at up to 667 MHz with a programmable logic fabric connected over the ARM AMBA® AXI interconnect [8]. Zynq provides an ideal platform for realizing a flexible architecture. A portion of the resources within the Zynq programmable logic can be configured as a Low Voltage Differential Signaling (LVDS) interface, which is the common interface for high-speed analog-to-digital converters (ADCs). The ADC module is a key consideration in the system implementation for achieving the best signal fidelity. A DMA module can be realized within the programmable logic to manage high-speed transfer of ADC data to memory. An Ethernet or UART interface can be deployed for the system to communicate with a host computer. Over the SPI bus, the back-end processing system configures the front-end devices in real time, including the beamformer that generates the pulse pattern, the variable gain amplifier VCA8500 and the T/R switch LM96530.
In this paper, Section 2 details the RUTS process flow and describes the components within RUTS. Section 3 evaluates RUTS through experimental measurements and by running a signal processing algorithm. Section 4 concludes the paper.
2. System Description
RUTS provides a flexible ultrasonic design platform for both hardware and software engineers by using existing standards. Each component is configured by the ARM processor within the Zynq SoC. Figure 2 shows the RUTS process flow, including the sequence of events. After system initialization is completed, the profile of the transmitted wavefront and the gain of the receive channel are configured by programming the beamformer and the LNP/VGA, respectively. Following this, the pulser excitation signal is fired to trigger the beamformer, and the reflected echo signal is captured and checked for the expected signal quality by determining its signal-to-noise ratio (SNR) and/or certain timing parameters. Echo signals that do not meet the quality standards are discarded, and the system is reconfigured until the quality of the received echo is acceptable for further signal processing. These iterations are intended to select the best reflected echo for the most accurate imaging. The events and the transitions between them are directly controlled by the Zynq SoC via hardware/software co-design methods. Alternatively, a personal computer or smartphone can initiate the process through a wired channel or over the Internet. An ultrasonic diagnostic system made up of multiple components, including an ultrasonic evaluation module such as RUTS, can be connected remotely over the Internet following the modular concept of the Internet of Things: a specific ID is assigned to the unit, and an Internet application initiates the process and remotely collects the processed data. The component descriptions and the configurable parameters for each event in the process flow are discussed in the following sub-sections.
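The configure, fire, capture and evaluate loop described above can be illustrated with a minimal Python sketch. It is not the authors' firmware: `simulated_capture` is a hypothetical stand-in for the hardware steps, the echo and noise amplitudes are invented for illustration, and the SNR estimate (echo-window power over noise-window power) is only one of several possible quality metrics. The gain values are the VCA8500 settings listed in Section 2.2.

```python
import math
import random

PREAMP_GAIN_DB = 20                 # fixed pre-amplifier gain (Section 2.2)
POST_GAINS_DB = (20, 25, 27, 30)    # selectable post-gain settings (Section 2.2)


def snr_db(samples, echo_window, noise_window):
    """Mean power in the echo window over mean power in the noise window, in dB.

    The echo window also contains noise, so this is strictly (S+N)/N.
    """
    def mean_power(window):
        start, stop = window
        segment = samples[start:stop]
        return sum(x * x for x in segment) / len(segment)
    return 10.0 * math.log10(mean_power(echo_window) / mean_power(noise_window))


def simulated_capture(total_gain_db, rng):
    """Hypothetical stand-in for 'configure AFE, fire pulser, capture echo'.

    Assumed model: a 1 mV echo burst is amplified by the receive gain and then
    sampled together with fixed noise of 50 mV rms.
    """
    gain = 10.0 ** (total_gain_db / 20.0)
    samples = [rng.gauss(0.0, 0.05) for _ in range(256)]
    for k in range(100, 140):
        samples[k] += 0.001 * gain * math.sin(2.0 * math.pi * (k - 100) / 8.0)
    return samples


def tune_receive_gain(min_snr_db, rng):
    """Step through the gain settings until the captured echo meets the SNR target."""
    snr = float("-inf")
    for post_gain in POST_GAINS_DB:
        total_gain = PREAMP_GAIN_DB + post_gain
        echo = simulated_capture(total_gain, rng)
        snr = snr_db(echo, echo_window=(100, 140), noise_window=(0, 80))
        if snr >= min_snr_db:
            return total_gain, snr      # echo accepted for further processing
    return None, snr                    # no setting met the target


if __name__ == "__main__":
    gain, snr = tune_receive_gain(8.0, random.Random(0))
    print("accepted gain (dB):", gain, "estimated SNR (dB):", round(snr, 1))
```

In a real system the loop could also adjust the pulse pattern or timing parameters; only the gain is varied here to keep the sketch short.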
2.1. Data-Acquisition Unit
The analog front-end sub-system TX-SDK-V2 demonstrates a comprehensive signal transmit and receive path. The transmit side of the system contains a beamformer to generate the pulse pattern and a high-voltage (HV) pulser to drive the transducer probe. The receive side is made up of a T/R Switch to isolate the high-voltage transmit pulse and input stage of the low voltage pre-amplifier.
On the transmit side, the ultrasonic signal to be transmitted is programmed by communicating with the beamformer through SPI. The beamformer is a programmable device capable of configuring the delay pattern and pulse train required to set the desired transmit focal point, with pulse trains of up to 64 pulses and a 12.5 ns pulse duration, making it suitable for a variety of ultrasonic imaging and evaluation applications [4]. This 8-channel digital beamformer generates the pulse pattern in positive and negative levels to drive the high-voltage pulser control inputs. Each channel drives a single transducer, so the system can support up to 8 transducers. The pulse pattern on each of the 8 channels (CH0 to CH7) is programmed into the internal registers of the beamformer for the positive and negative pulse trains. This is used to adjust the temporal shape of the ultrasound pulse. The beamformer supports various configurations for beam forming and beam steering. This helps in realizing many ultrasonic signal processing algorithms, such as the discrete wavelet transform (DWT) [9] and split-spectrum processing (SSP) [10].
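To make the idea of a per-channel delay pattern concrete, the following sketch computes transmit delays that focus an 8-channel aperture at a point on its axis. The geometry is an assumption made for illustration only (a linear arrangement with 0.5 mm spacing, a 20 mm focal depth and a sound speed of 1480 m/s, typical of water); the source does not specify the transducer layout, and the mapping of these delays onto beamformer registers is not shown.

```python
import math


def focus_delays_ns(num_channels, pitch_m, focus_depth_m, sound_speed_m_s):
    """Per-channel firing delays (ns) so that all wavefronts reach the focus together.

    Elements are assumed to lie on a line, centered on the array axis, with the
    focal point on that axis at the given depth.
    """
    center = (num_channels - 1) / 2.0
    paths = [
        math.hypot((i - center) * pitch_m, focus_depth_m)
        for i in range(num_channels)
    ]
    longest = max(paths)
    # The element with the longest path fires first (zero delay); elements
    # closer to the focus wait for the difference in travel time.
    return [(longest - p) / sound_speed_m_s * 1e9 for p in paths]


if __name__ == "__main__":
    delays = focus_delays_ns(8, 0.5e-3, 20e-3, 1480.0)
    for channel, delay in enumerate(delays):
        print(f"CH{channel}: {delay:.1f} ns")
```

The outer channels receive zero delay and the central channels the largest delay, which is the usual shape of a focusing delay profile.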
The T/R switch provides an 8-channel programmable receive side interface [5] with each Rx switch channel driven by a HV pulser that is directly connected to a transducer. The T/R switch protects the receiver input from voltage spikes due to leakage currents flowing through the switches on Rx channel. Each T/R switch can be individually programmed ON or OFF allowing for low power operation to selectively configure desired channels through SPI.
2.2. Configurable Amplifier Unit
The next configurable component on the acquisition side is the variable gain amplifier. The VCA8500 supports up to 8 channels and can amplify a received echo with a gain ranging from 40 dB to 50 dB. It consists of a low-noise pre-amplifier with a fixed gain of 20 dB and a post-gain amplifier with four gain settings: 20 dB, 25 dB, 27 dB, and 30 dB [6].
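As a quick check of the stated 40 dB to 50 dB range, the sketch below adds the fixed pre-amplifier gain to each post-gain setting and converts the total to a linear voltage ratio using the standard relation ratio = 10^(dB/20). The function name is illustrative only.

```python
PREAMP_GAIN_DB = 20                 # fixed low-noise pre-amplifier gain
POST_GAINS_DB = (20, 25, 27, 30)    # selectable post-gain amplifier settings


def total_gain_table():
    """Return (post gain dB, total gain dB, linear voltage ratio) per setting."""
    table = []
    for post_gain in POST_GAINS_DB:
        total = PREAMP_GAIN_DB + post_gain
        table.append((post_gain, total, 10.0 ** (total / 20.0)))
    return table


if __name__ == "__main__":
    for post_gain, total, ratio in total_gain_table():
        print(f"post gain {post_gain} dB -> total {total} dB (x{ratio:.0f})")
```

The totals come out as 40, 45, 47 and 50 dB, i.e. voltage ratios from 100 to roughly 316.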
2.3. Analog-to-Digital Converter
The AD9467 from Analog Devices is a 16-bit, 250 MSPS ADC. It is integrated in the AD9467-250EBZ FMC Evaluation Board [7], which also carries an AD9517 clock management chip that generates the clock for the AD9467. The wide working frequency range, from 50 Hz to 250 MHz, makes the board well suited to sampling ultrasonic signals. The data path for this section is shown in Figure 3. The ADC receives the analog signal from the VCA8500 and forwards the digitized 16-bit data to the back-end processing sub-system (Zynq SoC) through an LVDS interface.
2.4. Back-End Processing Unit Using Zynq SoC
The advantage of the Zynq SoC is the tight coupling of its dual-core ARM Cortex A9 processor with programmable FPGA logic, which makes unique and powerful designs possible. Each processor has a separate memory management unit (MMU) and 32 KB level-one instruction and data caches, which permit local storage of frequently used data and instructions. The Neon single instruction multiple data (SIMD) engine in the ARM processor accelerates compute-intensive applications, which suits an ultrasonic testing system well. An FPGA by nature allows design reuse. A wide variety of IP libraries supplied with the Xilinx system development kit (SDK) can shorten the design cycle and lower the cost. In this study, a few IP cores such as DMA are integrated in RUTS to capture data from the high-speed ADC and prepare it for various algorithms.
To capture the high-speed data from the ADC, some efficient techniques have been applied in RUTS. As shown in Figure 4, the incoming LVDS data from the ADC has an 8-bit data width, which is not compatible with the 16-bit data width of the DMA. A bandwidth translator is therefore implemented in the programmable logic to convert the 8-bit data into 16-bit data. The translated data is transferred directly to the RAM by the DMA, without the intervention of the ARM processor, for optimum system performance. The DMA is controlled by the ARM processor through the General Purpose AXI (GP AXI) bus, which is also known as AXI Lite. High-speed data is transferred to the RAM through the High Performance AXI (HP AXI) bus. To perform the data transfer, the ARM processor needs to configure the ADC and the DMA. When a transmit signal is given to the ADC and the DMA, data from the ADC is stored at a certain memory location within the RAM. Subsequently, the ARM processor can access the data stored in the RAM for further processing [11].
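The width conversion performed by the bandwidth translator can be modelled in software as pairing consecutive 8-bit transfers into one 16-bit word. The sketch below is a behavioural model only, not the programmable-logic design: the text does not state how the 16 sample bits are split across the two transfers, so the "high byte first" ordering is an assumption and is exposed as a parameter.

```python
def pack_bytes_to_words(byte_stream, high_byte_first=True):
    """Combine consecutive 8-bit values into 16-bit words.

    byte_stream: sequence of integers in 0..255 (e.g. a bytes object).
    high_byte_first: assumed ordering of the two halves of each sample.
    """
    if len(byte_stream) % 2 != 0:
        raise ValueError("need an even number of 8-bit transfers")
    words = []
    for i in range(0, len(byte_stream), 2):
        first, second = byte_stream[i], byte_stream[i + 1]
        high, low = (first, second) if high_byte_first else (second, first)
        words.append((high << 8) | low)
    return words


if __name__ == "__main__":
    raw = bytes([0x12, 0x34, 0xAB, 0xCD])
    print([hex(w) for w in pack_bytes_to_words(raw)])
```

In hardware the same operation would typically be a small register stage that latches one transfer and emits a full word on the next.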
Another method of executing the signal processing algorithm on RUTS is to use the FPGA to process the data. As shown in Figure 5, the system includes some additional blocks to help process the data compared to the system in Figure 4. The data captured by the ADC is first handled by pre-processing logic, which converts the 8-bit LVDS data into 16-bit data and performs some data processing. The partially processed data is then transferred to the RAM, where it is re-organized by the ARM processor according to the application requirements. Subsequently, the DMA transfers the re-organized data from the RAM to the FPGA post-processing logic to complete the processing, and the result is copied back to the RAM. Using the FPGA to process the data can improve the computational performance substantially.
Furthermore, to extend the capabilities of RUTS, different methods have been developed to transfer acquired data to a host computer. One direct method is to store the acquired data on the SD card available on the Zedboard and manually move the SD card to the computer. To utilize the computational power of a host computer and conduct real-time data processing, UART communication is employed at a baud rate of 115200. However, this method does not meet the high-speed data transfer requirement of an ultrasonic system. Instead, Ethernet communication with gigabit bandwidth can be used to transmit a large amount of data with acceptable delay.
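A rough estimate shows why the UART link falls short. The sketch below assumes 10 bits on the wire per byte for the UART (one start bit, eight data bits, one stop bit) and an ideal 1 Gbit/s Ethernet link with protocol overhead ignored; both are simplifying assumptions, and the 1 MB payload is an arbitrary illustrative size.

```python
def uart_seconds(num_bytes, baud=115200, bits_per_byte=10):
    """Time to send num_bytes over a UART, assuming 8N1 framing (10 bits per byte)."""
    return num_bytes * bits_per_byte / baud


def ethernet_seconds(num_bytes, link_bits_per_s=1e9):
    """Time to send num_bytes over an ideal link, ignoring protocol overhead."""
    return num_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    payload = 1_000_000  # illustrative payload of 1 MB
    print(f"UART:     {uart_seconds(payload):.1f} s")
    print(f"Ethernet: {ethernet_seconds(payload) * 1e3:.1f} ms")
```

Under these assumptions the UART needs well over a minute per megabyte, while the Ethernet link needs a few milliseconds.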
The reconfiguration capabilities of RUTS are realized via SPI, mastered by the Zynq SoC. The three devices shown in Figure 6 receive configuration data from the Zynq SoC through the same SPI port, with separate slave select signals to distinguish among the devices. All three devices share a common master-out slave-in (MOSI) and clock connection. Each device has a master-in slave-out (MISO) line, which is connected to the Zynq SoC. The SPI ports are initialized to known values to prevent misconfiguration of the three devices and to ensure that the devices do not begin transmission before they are configured. Based on the SPI controller features, the specific parameters of the SPI protocol for the AFE components LM96530, LM96570 and VCA8500 are presented in Table 1. The AFE components are configured in a fixed order: the beamformer (LM96570) first, then the amplifier (VCA8500), and finally the T/R switch (LM96530). Since the amplifier and the T/R switch support serial data transfer only up to 10 MHz, the SPI clock frequency for those devices is set to 7.8 MHz, and that for the beamformer is set to 62.5 MHz. After configuring the AFE components, the ARM processor triggers the transducer excitation and initiates the subsequent data acquisition through the ADC. The digitized data coming out of the ADC is then processed in the ARM Linux environment with the signal processing algorithm, or can be accessed via a Graphical User Interface (GUI) that displays the ultrasonic scan of the target.
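The configuration sequence can be sketched as follows. This is a software model with a mock bus, not a driver for the Zynq SPI controller: the slave select indices and the payload bytes are hypothetical placeholders, and the beamformer's maximum clock is left unset because the text does not state it. Only the device order, the 7.8 MHz and 62.5 MHz clock settings and the 10 MHz limit come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AfeDevice:
    name: str
    slave_select: int                # hypothetical slave select index
    clock_hz: float                  # SPI clock used for this device
    max_clock_hz: Optional[float]    # None where the text gives no limit


BEAMFORMER = AfeDevice("LM96570", 0, 62.5e6, None)
AMPLIFIER = AfeDevice("VCA8500", 1, 7.8e6, 10e6)
TR_SWITCH = AfeDevice("LM96530", 2, 7.8e6, 10e6)

# Order stated in the text: beamformer, then amplifier, then T/R switch.
CONFIG_ORDER = (BEAMFORMER, AMPLIFIER, TR_SWITCH)


class MockSpiBus:
    """Records transfers instead of driving hardware."""

    def __init__(self):
        self.log = []

    def write(self, device, payload):
        if device.max_clock_hz is not None and device.clock_hz > device.max_clock_hz:
            raise ValueError(f"{device.name}: SPI clock exceeds device limit")
        self.log.append((device.name, device.slave_select, device.clock_hz, bytes(payload)))


def configure_afe(bus, payloads):
    """Send each device its configuration payload in the required order."""
    for device in CONFIG_ORDER:
        bus.write(device, payloads[device.name])


if __name__ == "__main__":
    bus = MockSpiBus()
    # Placeholder payloads; real register contents come from the device datasheets.
    configure_afe(bus, {"LM96570": b"\x01", "VCA8500": b"\x02", "LM96530": b"\x03"})
    for entry in bus.log:
        print(entry)
```

Keeping the per-device clock next to the slave select makes it easy to switch the bus speed whenever a different device is addressed.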
Figure 4. ADC data management in Zynq SoC.
Figure 5. FPGA based data processing architecture.
Figure 6. Configuration via SPI within RUTS.
3. Evaluation of RUTS
The performance of RUTS has been evaluated by conducting experimental measurements to acquire ultrasonic data from a test specimen by programming the front-end system components. Furthermore, the back-end system (Zynq SoC) is evaluated by implementing an ultrasonic 3D data compression algorithm on the acquired data using the discrete wavelet transform method, and analyzing the results for real-time performance.
Table 1. SPI settings for AFE communication.
3.1. Experimental Measurement Example
The ARM processor within Zynq SoC controls the components on the AFE via the serial peripheral interface and provides excitation for the immersion-type 3.5 MHz piezoelectric transducer. Figure 7 shows the RUTS components assembled to perform the system evaluation. Figure 8 shows the ultrasound experimental setup, where the surface of the steel block with a flaw is scanned by using the transducer. The transducer is moved with the help of a step motor [12] . Each of the AFE devices is sequentially programmed with the desired data which is given in Table 2. The selected pulse pattern in this study is shown in Figure 9(a). The corresponding echoes received from the steel block when this test pulse is sent are shown in Figure 9(b). The first burst is the output pattern of the transducer. The next four bursts (A, B, C, and D in Figure 9(b)) are the received echoes due to the reverberation in physical media. The acquired data consisting of these received echoes can be processed by applying signal processing algorithms, as demonstrated in Section 3.2.
3.2. Signal Processing Example
Signal processing algorithms are essential to every ultrasonic application. RUTS with dynamic programming ability helps to effectively apply many ultrasonic signal processing algorithms to applications such as imaging and nondestructive evaluation of materials [13] . For example, this system offers an accurate timing control for coherent averaging [14] , which is a method to improve SNR of the reflected ultrasonic echoes by averaging several echo responses over a particular time window. Another algorithm known as split spectrum processing (SSP) [15] uses multiple frequency bands for the inspection of the target. RUTS can facilitate this by programming the center frequency of the desired ultrasonic transducer excitation up to 15 MHz on the beamformer and controlling the desired pulse profile.
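Coherent averaging can be illustrated with a short simulation. The sketch below is a generic textbook version, not the implementation from [14]: the echo shape, noise level and number of acquisitions are invented for illustration, and the acquisitions are assumed to be perfectly time-aligned, which is what accurate timing control is meant to provide. With uncorrelated noise, averaging N acquisitions is expected to reduce the noise amplitude by about the square root of N.

```python
import math
import random


def coherent_average(acquisitions):
    """Sample-by-sample mean of time-aligned acquisitions of equal length."""
    count = len(acquisitions)
    return [sum(column) / count for column in zip(*acquisitions)]


def rms_error(signal, reference):
    """Root-mean-square difference between a signal and a reference."""
    total = sum((s - r) ** 2 for s, r in zip(signal, reference))
    return math.sqrt(total / len(reference))


def demo(num_acquisitions=64, length=512, noise_sigma=1.0, seed=1):
    """Return (rms error of one noisy acquisition, rms error after averaging)."""
    rng = random.Random(seed)
    # Illustrative echo: a sine burst under a Gaussian envelope.
    clean = [
        math.exp(-((k - 256) / 40.0) ** 2) * math.sin(2.0 * math.pi * k / 16.0)
        for k in range(length)
    ]
    acquisitions = [
        [c + rng.gauss(0.0, noise_sigma) for c in clean]
        for _ in range(num_acquisitions)
    ]
    averaged = coherent_average(acquisitions)
    return rms_error(acquisitions[0], clean), rms_error(averaged, clean)


if __name__ == "__main__":
    single, averaged = demo()
    print(f"rms noise, single acquisition: {single:.3f}")
    print(f"rms noise, 64 averaged:        {averaged:.3f}")
```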
In this study, the RUTS platform is evaluated by implementing a DWT-based ultrasonic 3D compression algorithm [16] on the Zynq SoC, both as a hardware design using the programmable logic and as a software design using the ARM processor.
Figure 7. RUTS components integrated to perform system evaluation. [1 Zedboard, 2 TX-SDK-V2, 3 VCA, 4 Transducer, 5 AD9467 Evaluation board, 6 Isolation board to step motor, 7 Power module for the whole system].
Figure 8. Ultrasonic experimental setup.
The high-speed ADC in RUTS samples the ultrasonic data at 250 MSPS with 16 bits per sample. This results in a throughput of 500 MB/s. With all 8 channels active, the throughput increases to 4 GB/s on the serial LVDS output from the ADC. This generates a very large amount of data, which needs to be compressed so that the storage requirement is reduced significantly and the compressed data can be transferred quickly to remote locations for expert analysis. The data captured by the ADC is segmented and buffered before it is compressed by the Zynq SoC. A four-level DWT is applied to the buffered ultrasonic data to produce narrower sub-bands. The sub-bands that possess very low energy are eliminated to maximize the compression while retaining high-quality signal reconstruction.
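The sub-band elimination step can be sketched in one dimension as follows. This is a minimal illustration rather than the algorithm of [16]: the Haar wavelet is chosen only because it is the simplest orthonormal wavelet (the text does not name the wavelet used), and the 1% energy threshold is an arbitrary illustrative value. The input length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

_S = 1.0 / math.sqrt(2.0)   # orthonormal Haar scaling factor


def haar_analysis(x):
    """One DWT level: split x into approximation and detail halves."""
    approx = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Inverse of haar_analysis."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def dwt(x, levels=4):
    """Multi-level DWT; details[0] is the finest (highest-frequency) band."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, detail = haar_analysis(approx)
        details.append(detail)
    return approx, details


def idwt(approx, details):
    """Reconstruct the signal from the approximation and detail bands."""
    for detail in reversed(details):
        approx = haar_synthesis(approx, detail)
    return approx


def energy(band):
    return sum(v * v for v in band)


def drop_low_energy_bands(approx, details, fraction=0.01):
    """Zero every sub-band holding less than `fraction` of the total energy."""
    bands = [approx] + details
    total = sum(energy(b) for b in bands)
    kept = [b if energy(b) >= fraction * total else [0.0] * len(b) for b in bands]
    return kept[0], kept[1:]


if __name__ == "__main__":
    signal = [math.sin(2.0 * math.pi * k / 64.0) for k in range(64)]
    approx, details = drop_low_energy_bands(*dwt(signal, levels=4))
    rebuilt = idwt(approx, details)
    lost = energy([a - b for a, b in zip(rebuilt, signal)]) / energy(signal)
    print(f"fraction of signal energy lost: {lost:.4f}")
```

Because the transform is orthonormal, the energy of the reconstruction error equals the energy of the discarded bands, which is why discarding only low-energy bands preserves signal quality.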
3D compression is performed by successive 1D compression in each of the three axial directions. The 3D ultrasonic experimental data consisting of a volumetric image of 2048 × 128 × 128 samples (66 Mbytes when each sample is represented using 16 bits) is captured by scanning a 2 inch by 2 inch surface of a steel block test specimen by using a 5 MHz ultrasonic broadband transducer (A3062) at 100 MHz sampling rate with 8 bit resolution [16]. Figure 10 shows one original ultrasonic signal (with 2048 samples), and the reconstructed signal from the 3D compressed data. The high similarity between the original and reconstructed signal indicates the efficiency of the compression method implemented on RUTS.
Figure 9. (a) Test pulse pattern; (b) Echo received for the test pulse.
Figure 10. Original ultrasonic signal (top trace) and the 3D reconstructed signal (bottom trace).
For the hardware design, the overall 3D compression with a compression ratio of 98.7% (compressing the volumetric data of 33 MBytes into 0.4 MBytes) is completed in one-fourth of a second, which indicates that the compression is fast enough for rapid, real-time transmission of data to remote locations via the Internet. The software design requires only one minute to compress 33 MBytes of ultrasonic data into 0.4 MBytes, which indicates that this implementation is also highly suitable for real-time ultrasonic imaging applications.
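The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text (a 2048 × 128 × 128 volume, 33 MBytes compressed to 0.4 MBytes, 0.25 s for the hardware design and 60 s for the software design); the function names are illustrative, and the results are approximate because the quoted sizes are rounded.

```python
def volume_bytes(dims, bits_per_sample):
    """Storage needed for a volume with the given dimensions and sample width."""
    count = 1
    for n in dims:
        count *= n
    return count * bits_per_sample // 8


def size_reduction(original, compressed):
    """Fraction of the original size removed by compression."""
    return 1.0 - compressed / original


def throughput_mb_per_s(megabytes, seconds):
    """Input data processed per second."""
    return megabytes / seconds


if __name__ == "__main__":
    dims = (2048, 128, 128)
    print("8-bit volume: ", volume_bytes(dims, 8) / 1e6, "MB")
    print("16-bit volume:", volume_bytes(dims, 16) / 1e6, "MB")
    print(f"size reduction: {size_reduction(33.0, 0.4) * 100:.1f} %")
    print("hardware:", throughput_mb_per_s(33.0, 0.25), "MB/s")
    print("software:", round(throughput_mb_per_s(33.0, 60.0), 2), "MB/s")
```

With one byte per sample the volume is about 33.5 MB, consistent with the 33 MBytes quoted, and the size reduction works out to just under 99%, in line with the stated ratio. The hardware design processes roughly 132 MB of input per second, against about 0.55 MB per second for the software design.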
4. Conclusion
By utilizing advances in the field of programmable hardware, RUTS provides a flexible, adaptable and efficient platform for realizing and testing computationally intensive ultrasonic signal processing applications. By dynamically controlling the reconfigurable analog front-end hardware, RUTS enables researchers to conduct real-time ultrasonic signal processing experiments for NDE. The Zynq SoC within RUTS enables flexible implementation methods such as hardware-only design, software-only design, or hardware/software co-design, by combining the capabilities of the ARM processor with the programmable logic. Furthermore, the modular design architecture of RUTS allows any of the components to be replaced with an upgraded version for improved performance. Moreover, the Ethernet capability of the Zynq SoC helps to control the ultrasonic testing system remotely. While offering a portable solution for ultrasonic nondestructive testing and imaging applications, RUTS supports the evolution of modern ultrasonic development systems.
Acknowledgements
The authors of this paper would like to express their special gratitude to Xilinx Inc. for providing the necessary hardware and software tools to conduct experimentation for successfully completing this research work.