School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
Traditional mobile sensing applications rely on extra equipment, which is impractical for most users. Smartphones have developed rapidly in recent years and are becoming an indispensable part of daily life. The sensors embedded in them open up many possibilities for mobile applications, and these applications are changing the way we live. In this paper, we analyze and discuss existing sensing-based mobile applications and then point out future directions.
Sensing builds a bridge between the real world and the virtual world; with the help of various sensors, man-made devices are able to perceive the world as living creatures do. The bell may be the first generation of sensor: people tie a bell to a string so that when the string vibrates, the bell rings. A bell is a powerful and effective sensor that combines two parts, detection and processing. When a bell detects a vibration, it rings for a period, and the volume of the ringing increases with the amplitude of the vibration. However, a bell is a sensor that connects the real world to the real world. With the development of electronic devices, a new man-made world has been built. This world is called the virtual world; many complicated calculations run in it so that people in the real world can enjoy their lives. The virtual world needs data to keep running, and entering data through human operation alone is far from enough. A sensor senses the world and translates the sensed information into the data format of the virtual world; therefore, sensing has become an important topic in both research and industry.
Early sensing-based applications were mostly used for research purposes or in specific areas. References [1, 2] propose localization methods for finding odor sources using gas sensors and anemometric sensors. Reference  uses a number of sensors embedded into a cyclist’s bicycle to gather quantitative data about the cyclist’s rides; this information is useful for mapping the cyclist experience. Reference  uses body-worn sensors to build an activity recognition system, and  uses body-worn sensors for healthcare monitoring. Reference  proposes a robotic fish carrying sensors for mobile sensing. Wireless Sensor Networks (WSN) also host many sensing-based applications. References [7, 8] deploy wireless sensors to track the movement of mobile objects. References [9, 10] deploy sensors for monitoring volcanoes.
People-centric sensing mentioned in  uses smartphones for mobile sensing. Smartphones have become very popular and are now carried by people almost everywhere; they are embedded with various sensors that can be used for many interesting applications. Unlike dedicated sensors built for specific areas, the sensors in smartphones support a wide range of applications that help and change people’s lives; also, using a smartphone instead of dedicated equipment makes an application easier for users to accept.
In this paper, we will discuss some existing interesting sensing-based applications using smartphones and give some possible future directions. Section 2 gives detailed descriptions of sensors embedded in modern smartphones; Section 3 introduces some sensing-based applications; Section 4 gives a conclusion and future directions.
2. Sensors in Smartphones
As Figure 1 shows, modern smartphones have several kinds of sensors. The most common ones, found in most smartphones, are the accelerometer, gyroscope, magnetometer, microphone, and camera. In this section, we discuss the characteristics of these sensors.
Figure 1: Sensors inside of smartphones.
An accelerometer measures proper acceleration, which is the acceleration it experiences relative to free fall and is the acceleration felt by people and objects. To put it another way, at any point in spacetime the equivalence principle guarantees the existence of a local inertial frame, and an accelerometer measures the acceleration relative to that frame. Such accelerations are popularly measured in terms of g-force .
An accelerometer works on the principle of inertial force. Imagine a box with six walls and a ball floating in the middle of the box because no force acts on the ball (e.g., the box may be in outer space). When the box accelerates to the right, the ball hits the left wall. The left wall is pressure sensitive, so it can measure the force of the impact; therefore, the acceleration can be measured. Because of gravity, when the box is placed on earth, the ball keeps pressing on the bottom wall of the box and produces a constant acceleration of ~9.8 m/s². Gravity therefore affects the accelerometer when it is used to measure the speed or displacement of an object in three dimensions, and it must be subtracted before any such measurement. However, gravity can also be used to detect the rotation of a device. When a user rotates his/her smartphone, the content he/she is watching switches between portrait and landscape. As Figure 2 shows, when the screen of the smartphone is in portrait orientation, the y-axis senses gravity; when the screen is in landscape orientation, the x-axis senses gravity. According to this, users can rotate their screens without affecting their reading experience.
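The screen-rotation idea can be sketched in a few lines of Python. This is only an illustration: it assumes the common convention that the y-axis runs along the long edge of the screen and the x-axis along the short edge, and the function name `screen_orientation` is hypothetical.

```python
def screen_orientation(ax: float, ay: float) -> str:
    """Classify orientation from the gravity components on the x- and y-axes (m/s^2).

    Assumption: the y-axis runs along the long edge of the screen, so gravity
    falls mostly on y when the phone is upright and mostly on x when it is on its side.
    """
    # The in-plane axis that carries more of gravity is the one pointing up/down.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"


print(screen_orientation(0.3, 9.7))   # phone held upright
print(screen_orientation(-9.6, 0.5))  # phone turned on its side
```

A real implementation would normally add hysteresis so that the screen does not flip back and forth when the phone is held near 45 degrees.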
Figure 2: Screen rotation.
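In theory, the displacement can be calculated as
$$s = s_0 + v_0 t + \int_0^{t}\!\!\int_0^{\tau} a(u)\,du\,d\tau, \quad (1)$$
where $s$: displacement, $s_0$: initial displacement, $v_0$: initial velocity, and $a$: acceleration.
Equation (1) is a continuous function; the acceleration $a$ we get in a real environment is discrete due to sampling. To calculate the displacement from discrete values, (2) has to be used:
$$\int_0^{t} a(\tau)\,d\tau \approx \sum_{i=1}^{n} a_i\,\Delta t, \quad (2)$$
where $a(\tau)$: continuous acceleration, $a_i$: $i$th sample, and $\Delta t$: time increment.
Then, the velocity and displacement can be calculated as follows:
$$v_i = v_{i-1} + a_i\,\Delta t, \qquad s_i = s_{i-1} + v_i\,\Delta t. \quad (3)$$
The value the accelerometer returns is three-dimensional as Figure 2 shows; therefore, $a$ will be calculated as the following shows:
$$\vec{a} = \vec{a}_x + \vec{a}_y + \vec{a}_z, \quad (4)$$
where $\vec{a}_x$, $\vec{a}_y$, and $\vec{a}_z$ are vectors.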
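The discrete calculation described above can be sketched as a simple accumulation loop. The sketch handles one axis only and uses a plain rectangle rule; the function name `integrate_motion` and its parameters are illustrative assumptions, not part of any cited system.

```python
def integrate_motion(samples, dt, v0=0.0, s0=0.0, gravity=0.0):
    """Accumulate velocity and displacement from 1-D acceleration samples.

    samples: acceleration readings (m/s^2) taken every dt seconds
    gravity: constant gravity component on this axis, subtracted before integrating
    """
    v, s = v0, s0
    for a in samples:
        v += (a - gravity) * dt  # velocity grows by a_i * dt
        s += v * dt              # displacement grows by v_i * dt
    return v, s


# One second of constant 2 m/s^2 acceleration sampled at 1 kHz.
velocity, displacement = integrate_motion([2.0] * 1000, 0.001)
print(velocity, displacement)
```

Because every sample error is integrated twice, even a small constant bias in the readings makes the displacement drift quadratically with time.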
The accelerometer is good at measuring the displacement of an object; however, it measures the spin of the device inaccurately, which is an easy task for a gyroscope.
A gyroscope is a device for measuring or maintaining orientation, based on the principles of angular momentum. Mechanically, a gyroscope is a spinning wheel or disk in which the axle is free to assume any orientation. Although this orientation does not remain fixed, it changes in response to an external torque much less and in a different direction than it would without the large angular momentum associated with the disk’s high rate of spin and moment of inertia. The device’s orientation remains nearly fixed regardless of the mounting platform’s motion, because mounting the device in a gimbal minimizes external torque.
The gyroscope is a very sensitive device; it is good at detecting spin. Like the accelerometer, the gyroscope returns three-dimensional values; the coordinate system is as Figure 2 shows. The value the gyroscope returns is angular velocity, which indicates how fast the device rotates around each axis. The angular velocity can be calculated as
$$\vec{\omega} = \vec{\omega}_x + \vec{\omega}_y + \vec{\omega}_z, \quad (5)$$
where $\vec{\omega}$: angular velocity and $\vec{\omega}_x$, $\vec{\omega}_y$, $\vec{\omega}_z$: vectors of angular velocity around the x-, y-, and z-axes.
A magnetometer is a measuring instrument used to measure the strength and perhaps the direction of magnetic fields . Accelerometer and gyroscope are able to detect the direction of a movement; however, the direction is a relative direction; it obeys the coordinate system that a smartphone uses. Sometimes, different smartphones need to synchronize their directions; therefore, a magnetometer is needed to get an absolute direction (the direction obeys the coordinate system of earth).
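The magnetometer returns three-dimensional values; if the device is placed horizontally, the orientation angle can be calculated as
$$\theta = \arctan\left(\frac{m_y}{m_x}\right), \quad (6)$$
where $m_x$ and $m_y$ are the magnetometer readings on the x- and y-axes.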
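A common way to compute such an angle in code is the two-argument arctangent of the horizontal field components, which also resolves the quadrant. The sketch below assumes a level device; the function name `heading_degrees` is hypothetical, and how the resulting angle maps to compass north depends on the device's axis convention.

```python
import math


def heading_degrees(mx: float, my: float) -> float:
    """Angle of the horizontal magnetic field vector, in degrees within [0, 360).

    Assumption: the device lies flat, so only the x and y readings matter.
    """
    # atan2 keeps the correct quadrant and tolerates mx == 0.
    return math.degrees(math.atan2(my, mx)) % 360.0


print(heading_degrees(20.0, 0.0))   # field along +x
print(heading_degrees(0.0, -20.0))  # field along -y
```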
So far, we have introduced three types of sensors: accelerometer, gyroscope, and magnetometer. With the help of these three sensors, a smartphone can estimate all kinds of its own movements. However, in a real environment, measurement errors happen all the time; we describe a way to correct the offset error of the magnetometer, and the other two sensors may correct their errors in the same way.
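First, place the magnetometer horizontally, rotate it at a uniform speed, and measure the values of $x$ and $y$; then put the z-axis horizontally and rotate again to measure the value of $z$. Calculate the offset value on the three axes as
$$x_{\text{offset}} = \frac{x_{\max} + x_{\min}}{2}, \quad y_{\text{offset}} = \frac{y_{\max} + y_{\min}}{2}, \quad z_{\text{offset}} = \frac{z_{\max} + z_{\min}}{2}. \quad (7)$$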
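One widely used offset estimate, assumed here for illustration, takes the midpoint between the largest and smallest reading seen on an axis during a full rotation. The helper names `axis_offset` and `remove_offset` are hypothetical.

```python
def axis_offset(readings):
    """Midpoint between the extreme readings seen on one axis during a full rotation."""
    return (max(readings) + min(readings)) / 2.0


def remove_offset(readings, offset):
    """Subtract the estimated offset from every reading on that axis."""
    return [r - offset for r in readings]


# Readings on one axis over a full turn: a field of amplitude 30 shifted by +5.
x_readings = [35, 20, 5, -10, -25, -10, 5, 20]
x_off = axis_offset(x_readings)
print(x_off, remove_offset(x_readings, x_off))
```

After the correction the readings swing symmetrically around zero, which is what an ideal magnetometer would report during a uniform rotation.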
The microphone is a very common sensor; it is usually used for recording sound. The problem is how to process the recorded sound. The most common task is to find a known segment of sound in a recording. Cross-correlation is a method for searching for a piece of signal in a mixed signal. In signal processing, cross-correlation is a measure of similarity of two waveforms as a function of a time lag applied to one of them. This is also known as a sliding dot product or sliding inner product. It is commonly used for searching a long signal for a shorter, known feature. It also has applications in pattern recognition, single particle analysis, electron tomographic averaging, cryptanalysis, and neurophysiology.
Cross-correlation can be calculated as (8) shows. Suppose that the known symbol pattern of the sound wave of a turn signal is $s$ of length $L$ and $y(n)$ is the complex number representing the received symbol; then the cross-correlation at a shift position $p$ is
$$C(p) = \sum_{k=1}^{L} s^{*}(k)\, y(k + p). \quad (8)$$
If we cross-correlate a sound wave with itself, for example, the sound wave that Figure 3 shows, the result is as shown in Figure 4. The spike indicates the presence of the piece of signal.
Figure 3: A sound record of turn signal of automobile.
Figure 4: Cross-correlation.
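The sliding dot product can be sketched directly in plain Python. The names `cross_correlate` and `best_shift` are illustrative; indices start at 0 here, and production code would normally use an FFT-based routine for long recordings.

```python
def cross_correlate(pattern, signal):
    """Sliding dot product of `pattern` over `signal`; one value per shift position."""
    length = len(pattern)
    return [
        sum(pattern[k].conjugate() * signal[p + k] for k in range(length))
        for p in range(len(signal) - length + 1)
    ]


def best_shift(pattern, signal):
    """Shift position where the correlation magnitude peaks (the spike)."""
    scores = [abs(c) for c in cross_correlate(pattern, signal)]
    return max(range(len(scores)), key=scores.__getitem__)


pattern = [1, -1, 1, 1]
recording = [0, 0, 0.1, 1, -1, 1, 1, 0, -0.1]
print(best_shift(pattern, recording))
```

The conjugate matters only for complex samples; for real-valued audio it leaves the values unchanged.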
The camera captures visual information from the real world. From the human perspective, vision carries most of the information we receive. However, pattern recognition in computing is not yet mature enough to work as humans do. In this section, we briefly introduce the principle of pattern recognition.
A photo the camera records can be expressed as a matrix of the light intensity of each pixel (here we take a grayscale photo as an example). Suppose that the source matrices (the dictionary, as we call them) are $D_1, D_2, \ldots, D_n$ and the matrix to be recognized is $M$; then pattern recognition proceeds as (9) shows. The $i$th matrix in the dictionary is the result:
$$i = \arg\min_{j \in \{1, \ldots, n\}} \lVert D_j - M \rVert. \quad (9)$$
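A minimal nearest-match sketch of this dictionary lookup is given below. It assumes that "closest" means the smallest Frobenius (element-wise Euclidean) distance, which is one plausible choice rather than a confirmed detail of the original formulation; `frobenius_distance` and `recognize` are hypothetical names.

```python
def frobenius_distance(a, b):
    """Element-wise Euclidean distance between two equally sized matrices."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    ) ** 0.5


def recognize(dictionary, image):
    """Index of the dictionary matrix closest to `image`."""
    return min(
        range(len(dictionary)),
        key=lambda i: frobenius_distance(dictionary[i], image),
    )


dictionary = [
    [[0, 0], [0, 0]],      # all dark
    [[255, 255], [0, 0]],  # bright top row
    [[255, 0], [255, 0]],  # bright left column
]
photo = [[250, 240], [10, 5]]
print(recognize(dictionary, photo))
```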
Pattern recognition is far more complicated than (9); there are many good algorithms in pattern recognition area, like SIFT [19–23] and SVM [24–29]. But the recognition rate is still not good enough for practical applications.
In this section, we introduce a few interesting sensing-based applications using smartphones. We divide the applications into two categories: those based on the accelerometer, gyroscope, and magnetometer, and those based on the microphone and camera.
3.1. Accelerometer, Gyroscope, and Magnetometer
3.1.1. Trace Track
Finding a person in a public place is difficult; for example, when a person is in a crowded conference hall, library, or shopping mall, it is very difficult to find him/her. Even if the person tells you where he/she is, it is frustrating to find the place in an unfamiliar environment. Maps may be helpful but are not always handy. Smartphones make it possible to develop an electronic escort service using opportunistic user intersections. By periodically learning the walking trails of different individuals, as well as how they encounter each other in space and time, a route can be computed between any pair of persons. The escort system can guide a user to the vicinity of a desired person in a public place. Escort does not rely on GPS, WiFi, or war driving to locate a person; the escort user only needs to follow an arrow displayed on the phone.
The escort system presents an interesting idea: a smartphone acts as an escort that tells you how many steps you have walked and which direction you are heading in; see Figure 5(a). GPS is one way to achieve this; however, GPS does not work indoors. WiFi localization is a good indoor localization method; however, it cannot guarantee that there are enough WiFi fingerprints for localization. The escort system uses the accelerometer and gyroscope instead. However, the displacement calculated from the accelerometer is inaccurate because of the jerky movement of the smartphone in people’s pockets and inherent measurement error [31–33]; the displacement error may reach 100 m after a 30 m walk; see Figure 5(b). Escort avoids the problem by identifying an acceleration signature in human walking patterns. This signature arises from the natural up- and down-bounces of the human body while walking and can be used to count the number of steps walked. The physical displacement can then be computed by multiplying the step count by the user’s step size, which is a function of the user’s weight and height; see Figure 5(c). Escort varies the step size with an error factor drawn from a Gaussian distribution centered on 0 with standard deviation 0.15 m. This better accommodates the varying human step size.
Figure 5: (a) Accelerometer readings (smoothened) from a walking user. (b) Displacement error with double integration for two users. (c) Error with step count method.
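The step-count approach can be illustrated with a simple threshold-crossing counter. This is not Escort's actual signature detector, whose details are not given here; the threshold of 1.0 m/s², the 0.7 m step size, and the function names are assumptions made for the sketch.

```python
def count_steps(vertical_accel, threshold=1.0):
    """Count upward crossings of `threshold` in gravity-removed vertical acceleration.

    Assumption: each walking bounce pushes the signal above the threshold once.
    """
    steps = 0
    above = False
    for a in vertical_accel:
        if a > threshold and not above:
            steps += 1  # a new bounce has started
        above = a > threshold
    return steps


def walked_distance(step_count, step_size_m):
    """Displacement as step count times step size."""
    return step_count * step_size_m


# Ten identical bounces of a synthetic walking signal.
bounce = [0, 1.5, 2, 1.5, 0, -1.5, -2, -1.5]
steps = count_steps(bounce * 10)
print(steps, walked_distance(steps, 0.7))
```

Counting steps avoids double integration altogether, which is why the error no longer grows with the square of the elapsed time.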
A compass (magnetometer) is also used in the escort system. Like the accelerometer, the compass has measurement errors; its noise is caused by several factors, including user sway, movement irregularities, magnetic fields in the surroundings, and the sensor’s internal bias. Because these factors are related to the user, the surroundings, and the sensor, the noise is difficult to predict and compensate for. To characterize the compass noise, escort ran 100 experiments using 2 Nokia 6210 Navigator phones and observed an average bias of 8 degrees and a standard deviation of 9 degrees. In addition to this large noise range, escort made two consistent observations: when the user is walking in a constant direction, the compass readings stabilize quickly on a biased value and exhibit only small random oscillations, and after each turn, a new bias is imposed. Based on the two observations, escort identifies two states of the user: walking in a constant direction and turning. Turns are identified when the compass headings change more significantly than random oscillations would explain. The turn identification algorithm uses the following condition:
$$\left|\mathrm{Avg}(t_{i+1}) - \mathrm{Avg}(t_i)\right| \geq \frac{\mathrm{StdDev}(t_i) + \mathrm{StdDev}(t_{i+1})}{2} + G, \quad (10)$$
where $\mathrm{Avg}(t_i)$ denotes the average compass readings over a time period $t_i$ (e.g., 1 second), $\mathrm{StdDev}(t_i)$ is the standard deviation of compass readings during $t_i$, and $G$ is a guard factor. While on a constant direction, escort compensates the stabilized compass reading with the average bias and reports the resulting value as the direction of the user. During turns, escort considers the sequence of readings reported by the compass.
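A turn test of this kind might look as follows. The sketch assumes that the condition compares the change in window averages against the mean of the two window standard deviations plus the guard factor; the guard value of 5 degrees and the name `is_turn` are hypothetical, and heading wrap-around near 0/360 degrees is ignored for brevity.

```python
from statistics import mean, pstdev


def is_turn(prev_window, next_window, guard=5.0):
    """Flag a turn when the mean heading shifts by more than the noise plus a guard.

    prev_window, next_window: compass headings (degrees) from two consecutive periods.
    Note: headings that wrap around 0/360 degrees are not handled in this sketch.
    """
    shift = abs(mean(next_window) - mean(prev_window))
    noise = (pstdev(prev_window) + pstdev(next_window)) / 2.0
    return shift >= noise + guard


print(is_turn([90, 91, 89, 90], [91, 90, 90, 89]))      # steady walk
print(is_turn([90, 91, 89, 90], [178, 181, 180, 179]))  # right-angle turn
```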
Similar usage of the accelerometer and magnetometer appears in . It uses the accelerometer to estimate people’s walking traces and corrects the traces periodically using GPS. Using the accelerometer as a supplement to GPS localization is popular, and much research has focused on it [35–40].
Smartphones are used not only by people who are walking but also by people in vehicles.
3.1.2. Dangerous Drive
When drivers are sitting in a vehicle, their smartphones are able to measure the acceleration, velocity, and turns through the embedded sensors. Because smartphones are very popular, using them is an easy way to implement mobile sensing.
Driving style can be divided into two categories: nonaggressive and aggressive. To study vehicle safety systems, we need to understand and recognize driving events. Potentially aggressive driving behavior is currently a leading cause of traffic fatalities in the United States, and drivers are unaware that they commit potentially aggressive actions daily. To increase awareness and promote driver safety, a novel system has been proposed that uses Dynamic Time Warping (DTW) and smartphone-based sensor fusion (accelerometer, gyroscope, magnetometer, GPS, and video) to detect, recognize, and record these actions without external processing.
The system continuously collects motion data from the accelerometer and gyroscope in the smartphone at a rate of 25 Hz in order to detect specific dangerous driving behaviors. Typical dangerous driving behaviors are hard left and right turns, swerves, and sudden braking and acceleration patterns. These behaviors indicate potentially aggressive driving that would endanger both pedestrians and other drivers. The system first determines when a dangerous behavior starts and ends using endpoint detection. Once the system has a signal representing a maneuver, it can compare the signal to stored maneuvers (templates) to determine whether or not it matches an aggressive event.
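Template matching with DTW can be sketched with the textbook dynamic-programming recurrence. This is a generic one-dimensional DTW, not the cited system's implementation; the template names and sample values are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(maneuver, templates):
    """Name of the stored template with the smallest DTW cost to `maneuver`."""
    return min(templates, key=lambda name: dtw_distance(maneuver, templates[name]))


templates = {
    "hard_brake": [0, -3, -6, -3, 0],  # hypothetical longitudinal acceleration
    "hard_accel": [0, 2, 4, 2, 0],
}
print(classify([0, -2.8, -6.1, -6.0, -3.1, 0], templates))
```

DTW tolerates maneuvers that are performed faster or slower than the template, which a sample-by-sample comparison would penalize.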
References [42–44] use accelerometer and gyroscope for driving event recognition. Reference  includes two single-axis accelerometers, two single-axis gyroscopes, and a GPS unit (for velocity) attached to a PC for processing. While the system included gyroscopes for inertial measurements, they were not used in the project. Hidden Markov Models (HMM) were trained and used only on the acceleration data for the recognition of simple driving patterns. Reference  used accelerometer data from a mobile phone to detect drunk driving patterns through windowing and variation threshold. A Driver Monitor System was created in  to monitor the driving patterns of the elderly. This system involved three cameras, a two-axis accelerometer, and a GPS receiver attached to a PC. The authors collected large volumes of data for 647 drivers. The system had many components, one of them being detection of erratic driving using accelerometers. Braking and acceleration patterns were detected, as well as high-speed turns via thresholding. Additionally, the data could be used to determine the driving environment (freeway versus city) based on acceleration patterns.
References [45–47] use the accelerometer and gyroscope for gesture recognition. The uWave paper  and the gesture control work of Kela et al.  explore gesture recognition using DTW and HMM algorithms, respectively. They both used the same set of eight simple gestures, which included up, down, left, right, two opposing direction circles, square, and slanted 90-degree angle movements, for their final accuracy reporting. Four of these eight gestures are one-dimensional in nature. The results showed that using DTW with one training sample was just as effective as HMMs.
3.1.3. Phone Hack
The accelerometer is so sensitive that it can even detect finger presses on the screen of the smartphone. Reference  shows that the location of screen taps on modern smartphones and tablets can be identified from accelerometer and gyroscope readings. The findings have serious implications: the authors demonstrate that an attacker can launch a background process on commodity smartphones and tablets and silently monitor the user’s inputs, such as keyboard presses and icon taps. While precise tap detection is nontrivial, requiring machine-learning algorithms to identify fingerprints of closely spaced keys, sensitive sensors on modern devices aid the process. Reference  presents TapPrints, a framework for inferring the location of taps on mobile device touchscreens using motion sensor data combined with machine-learning analysis. Tests on two different off-the-shelf smartphones and a tablet computer show that identifying tap locations on the screen and inferring English letters can be done with up to 90% and 80% accuracy, respectively. By optimizing the core tap detection capability with additional information, such as contextual priors, the core threat may be further magnified.
Figure 6 shows a sample of the raw accelerometer data after a user has tapped on the screen; the timing of each tap is marked by the digital pulses on the top line. As we can see, each tap generates perturbations in the accelerometer sensor readings on the three axes, particularly visible along the z-axis, which is perpendicular to the phone’s display. Gyroscope data exhibits similar behavior and is not shown. Some related works [50–52] also use the accelerometer and gyroscope for phone hacks.
Figure 6: The square wave in the top line identifies the occurrence of a tap. Two particular taps have also been highlighted by marking their boundaries with dashed vertical lines. Notice that the accelerometer sensor readings (on the z-axis in particular) show very distinct patterns during taps. Similar patterns can also be observed in the gyroscope.
3.1.4. Phone Gesture
The first two subsections use the accelerometer, gyroscope, and magnetometer to detect long-distance movement; if the sensors are instead used to detect gestures, other interesting applications appear.
The ability to note down small pieces of information, quickly and ubiquitously, can be useful. Reference  proposes a system called PhonePoint Pen that uses the in-built accelerometer in mobile phones to recognize human writing. By holding the phone like a pen, a user should be able to write short messages or even draw simple diagrams in the air. The acceleration due to hand gestures can be converted into an image and sent to the user’s Internet email address for future reference .
The work does not use a gyroscope, because the smartphone used lacks one. Two main issues have to be solved: coping with background vibration (noise) and computing the displacement of the phone.
Accelerometers are sensitive to small vibrations. Figure 7(a) reports acceleration readings as the user draws a rectangle using 4 strokes (around −350 units on the -axis is due to earth’s gravity). A significant amount of jitter is caused by natural hand vibrations. Furthermore, the accelerometer itself has measurement errors. It is necessary to smooth this background vibration (noise) in order to extract jitter-free pen gestures. To cope with vibrational noise, the system smooths the accelerometer readings by applying a moving average over the last n readings. The results are presented in Figure 7(b) (the relevant movements happen in the -plane, so the -axis is removed from the subsequent figures for better visual representation).
Figure 7: (a) Raw accelerometer data while drawing a rectangle (note gravity on the -axis). (b) Accelerometer noise smoothing.
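The smoothing step can be sketched as a trailing moving average. The window length `n` is left as a parameter because its value is not stated here, and the function name `moving_average` is illustrative.

```python
def moving_average(readings, n):
    """Average each reading with up to n-1 readings before it (trailing window)."""
    smoothed = []
    window_sum = 0.0
    for i, value in enumerate(readings):
        window_sum += value
        if i >= n:
            window_sum -= readings[i - n]  # drop the reading that left the window
        smoothed.append(window_sum / min(i + 1, n))
    return smoothed


print(moving_average([1, 2, 3, 4, 5], 3))  # → [1.0, 1.5, 2.0, 3.0, 4.0]
```

A longer window removes more jitter but also blurs the sharp starts and stops that separate one stroke from the next.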
The phone’s displacement determines the size of the character. The displacement is computed as a double integral of acceleration, that is, $s = \iint a\,dt\,dt$, where $a$ is the instantaneous acceleration. In other words, the algorithm first computes the velocity (the integration of acceleration) followed by the displacement (the integration of velocity). However, due to errors in the accelerometer, the cumulative acceleration and deceleration values may not sum to zero even after the phone has come to rest. This offset translates into some residual constant velocity. When this velocity is integrated, the displacement and movement direction become erroneous. In order to reduce velocity-drift errors, the velocity needs to be reset to zero at some identifiable points. The stroke mechanism described earlier is therefore used. Characters are drawn using a set of strokes separated by short pauses. Each pause is an opportunity to reset velocity to zero and thus correct displacement. Pauses are detected by using a moving window over consecutive accelerometer readings and checking if the standard deviation in the window is smaller than some threshold. This threshold is chosen empirically based on the average vibration caused when the phone was held stationary. All acceleration values during a pause are suppressed to zero. Figure 8(a) shows the combined effect of noise smoothing and suppression. Further, velocity is set to zero at the beginning of each pause interval. Figure 8(b) shows the effect of resetting the velocity. Even if small velocity drifts are still present, they have a small impact on the displacement of the phone.
Figure 8: (a) Accelerometer readings after noise smoothing and suppression. (b) Resetting velocity to zero in order to avoid velocity drifts.
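Pause detection and velocity resetting can be combined in one integration loop, sketched below for a single axis. The window of 5 readings and the threshold of 0.05 are placeholder values, not the ones used by PhonePoint Pen, and `displacement_with_resets` is a hypothetical name.

```python
from statistics import pstdev


def displacement_with_resets(accel, dt, window=5, threshold=0.05):
    """Double-integrate acceleration, zeroing velocity whenever the phone is paused.

    A pause is declared when the standard deviation of the last `window`
    readings falls below `threshold` (both values are assumptions).
    """
    v = s = 0.0
    for i, a in enumerate(accel):
        recent = accel[max(0, i - window + 1): i + 1]
        paused = len(recent) == window and pstdev(recent) < threshold
        if paused:
            a = 0.0  # suppress acceleration during a pause
            v = 0.0  # reset velocity so residual drift stops accumulating
        v += a * dt
        s += v * dt
    return s


# One stroke whose acceleration and deceleration do not quite cancel, then a pause.
stroke = [0.5, 1.0, 1.5, 1.0, 0.5, -0.4, -0.9, -1.4, -0.9, -0.4]
print(displacement_with_resets(stroke + [0.0] * 50, 0.1))
```

Without the reset, the residual velocity left by the imperfect stroke would keep adding displacement for as long as the pause lasts.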
3.2. Microphone and Camera
3.2.1. Surround Sense
Several research works [54–57] use smartphones to sense the context of the surroundings. A growing number of mobile computing applications are centered around the user’s location. The notion of location is broad, ranging from physical coordinates (latitude/longitude) to logical labels (like Starbucks, McDonalds). While extensive research has been performed in physical localization, there have been few attempts to recognize logical locations. Reference  argues that the increasing number of sensors on mobile phones presents new opportunities for logical localization. Reference  postulates that ambient sound, light, and color in a place convey a photoacoustic signature that can be sensed by the phone’s camera and microphone. In-built accelerometers in some phones may also be useful in inferring broad classes of user motion, often dictated by the nature of the place. By combining these optical, acoustic, and motion attributes, it may be feasible to construct an identifiable fingerprint for logical localization. Hence, users in adjacent stores can be separated logically, even when their physical positions are extremely close; see Figures 9 and 10.
Figure 9: Sound fingerprints from 3 adjacent stores.
Figure 10: Color/light fingerprint in the HSL format from the Bean Traders’ coffee shop. Each cluster is represented by a different symbol.
Reference  takes advantage of microphone and camera to collect the surrounding fingerprint so that it can provide logical localization.
References [58–60] use audio for localization and the accuracy is improved to a high level. Reference  operates in a spontaneous, ad hoc, and device-to-device context without leveraging any preplanned infrastructure. It is a pure software-based solution and uses only the most basic set of commodity hardware—a speaker, a microphone, and some form of device-to-device communication—so that it is readily applicable to many low-cost sensor platforms and to most commercial-off-the-shelf mobile devices like cell phones and PDAs . High accuracy is achieved through a combination of three techniques: two-way sensing, self-recording, and sample counting. To estimate the range between two devices, each will emit a specially designed sound signal (“Beep”) and collect a simultaneous recording from its microphone . Each recording should contain two such beeps, one from its own speaker and the other from its peer . By counting the number of samples between these two beeps and exchanging the time duration information with its peer, each device can derive the two-way time of flight of the beeps at the granularity of the sound sampling rate . This technique cleverly avoids many sources of inaccuracy found in other typical time-of-arrival schemes, such as clock synchronization, non-real-time handling, and software delays .
Reference  sends a beep sound to calculate the distance between two devices; each microphone records both the device’s own beep and the beep of its peer, and the interval between the two yields the distance; see Figure 11.
Figure 11: Illustration of event sequences in BeepBeep ranging procedure. The time points are marked for easy explanation and no timestamping is required in the proposed ranging mechanism.
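The two-way ranging arithmetic can be sketched as follows. The sketch assumes that each device reports the sample indices at which it heard device A's beep and device B's beep in its own recording, that both record at the same sampling rate, and that sound travels at roughly 343 m/s; the function name, argument names, and the `self_dist` term (the average speaker-to-microphone distance on the two devices) are illustrative.

```python
def beep_range(a_heard_a, a_heard_b, b_heard_a, b_heard_b, fs, c=343.0, self_dist=0.0):
    """Distance between devices A and B from sample counts in their own recordings.

    a_heard_a, a_heard_b: sample indices in A's recording of A's beep and B's beep
    b_heard_a, b_heard_b: sample indices in B's recording of A's beep and B's beep
    fs: sampling rate in Hz; c: assumed speed of sound in m/s
    """
    # Each device only measures an interval on its own clock,
    # so no clock synchronization between the devices is needed.
    interval_a = a_heard_b - a_heard_a
    interval_b = b_heard_b - b_heard_a
    return c / (2.0 * fs) * (interval_a - interval_b) + self_dist


# Devices 3.43 m apart at 44.1 kHz: the one-way flight time is 441 samples.
print(beep_range(4410, 26901, 9261, 30870, 44100))
```

With a 44.1 kHz sampling rate, one sample corresponds to less than a centimeter of sound travel, which is where the fine granularity comes from.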
4. Discussions and Conclusion
Sensors are the key to developing more and more interesting applications on smartphones, and they distinguish the smartphone from traditional computing devices such as desktop computers. Most applications use the accelerometer and gyroscope, arguably because they are the most accurate sensors. However, vision carries a huge amount of information. We believe that the camera and pattern recognition will be used more and more in the future.
This work is supported by the National Science Foundation under Grant nos. 61103226, 60903158, 61170256, 61173172, and 61103227 and the Fundamental Research Funds for the Central Universities under Grant nos. ZYGX2010J074 and ZYGX2011J102.