Lidar mengungkapkan aktivitas anomali vektor malaria selama gerhana pan-Afrika

Lidar reveals activity anomaly of malaria vectors during pan-African eclipse


Yearly, a quarter billion people are infected and a half a million killed by the mosquito-borne disease malaria. Lack of real-time observational tools for continuously assessing the unperturbed mosquito flight activity in situ limits progress toward improved vector control. We deployed a high-resolution entomological lidar to monitor a half-kilometer static transect adjacent to a Tanzanian village. We evaluated one-third million insect observations during five nights, four days, and one annular solar eclipse. We demonstrate in situ lidar classification of several insect families and their sexes based on their modulation signatures. We were able to compare the fine-scale spatiotemporal activity patterns of malaria vectors during ordinary days and an eclipse to disentangle phototactic activity patterns from the circadian mechanism. We observed an increased insect activity during the eclipse attributable to mosquitoes. These unprecedented findings demonstrate how lidar-based monitoring of distinct mosquito activities could advance our understanding of vector ecology.


Malaria is a predominantly tropical disease caused by Plasmodium parasites mainly transmitted by Anopheles mosquitoes. The infectious disease slows the development of the African continent (12), and the annual human death tolls reach almost half a million, thereby ranking mosquitoes the deadliest animal on Earth (34). While malaria risk is exacerbated by poverty and poor housing, especially in rural areas, Africa is disproportionately affected because of several endemic human-specialized mosquito species that are consequently exceptionally efficient vectors (56). Despite progress diminishing malaria deaths (7) and large eradication efforts (810), low-cost diagnostics (1114), and treatment (15), malaria remains one of Africa’s major challenges. Because of the rapid evolution of both the parasite (16) and the mosquitoes (17), resistance can develop rapidly (1820), circumventing pesticide use, vaccination programs, and bed-netting campaigns (21).

Further progress toward malaria vector control and elimination will undoubtedly require new measures that target all life stages of the vectors (2224), notably those that occur outdoors and are widely distributed in the landscape. It will also require a greatly improved understanding of mosquito population ecology, so that the design and deployment of new tools may be rationally optimized. For example, detailed studies of population dynamics in the Sahel region revealed novel opportunities for targeting aestivating mosquitoes with residual insecticides during the dry season (25). However, detecting and quantifying wild mosquito activities in situ and mapping their distribution remain a challenge (2627). Here, we demonstrate that laser radar (lidar) surveillance (28) is a new and promising tool for characterizing fine-scale spatial and temporal distributions of malaria mosquitoes and for mapping their activity patterns across landscapes.

Experiment and field site

We set up a lidar of Scheimpflug type (29), sampling at 3.5-kHz repetition rate at the 808-nm near-infrared band at the periphery of Lupiro village, in the Kilombero Valley, south-eastern Tanzania [8°23′3.74′′S, 36°40′26.66″E, 308 m above sea level (ASL); Fig. 1] to evaluate the activity and fine-scale spatiotemporal distribution of malaria vectors during the 2016 pan-African solar eclipse. The main goal was to quantify diurnal variations of activity in wild malaria vector populations during the eclipse to assess any changes in these activities and associated risk of biting during eclipse events. A 596-m lidar transect propagated 3 to 5 m above the ground was used to assess different taxonomic groups and sexes of mosquitoes. A total of 312,191 insects were detected over the 5-day observation period.

Fig. 1 Views from above.(A) The pan-African annular eclipse on 1 September 2016, as seen at 11:45 local time from the geostationary Meteosat-10 orbiting at an altitude of 35,786 km. The location of the lidar is indicated with an arrow. (B) A close-up aerial view of the agricultural patches and the periphery of Lupiro village with the lidar transect indicated by the red line.

Several malaria vectors are found in the area, including Anopheles arabiensisAn. funestusAn. coustani, and An. rivolurum (30). A total of 3124 anopheline mosquitoes were caught in miniature light traps placed beside occupied bed nets in village houses between 18:00 and 07:00. Anopheles gambiae complex was the most abundant taxon sampled (table S1). Genetic analysis at Ifakara Health Institute of caught individuals confirmed that 98% of the An. gambiae complex collected was A. arabiensis; thus, this species dominated human biting indoors. The estimated mean biting rate was hundred bites per night for an unprotected person lacking a bed net (in general, villagers sleep under nets).


Results from entomological lidar

We were able to estimate modulation spectra and identify individual insects with beam transit times exceeding 23 ms (see fig. S2 in the Supplementary Materials and examples in Fig. 2). The fundamental tone, range, and time are given in the figure legends. Species identities and sex were inferred for the most abundant taxa in the area during the campaign based on previous laboratory wingbeat frequency estimates (3135). No Aedes spp., which have distinct frequency ranges, were captured, and Aedes is scarce in this location and season.

Fig. 2 Three examples of mosquito observations during the eclipse.The figures display backscattered intensity as a function of time, range, and frequency. The frequency is color-coded, whereby the mosquito pitch appears red. The nonoscillatory backscatter from the body appears yellow. Atmospheric turbulence is seen as a blue surface in the bottom of the plots. The corresponding time signal is shown on the back of the figure with an arbitrary scale, but the peak cross section is in the legend. The frequency content is projected onto the left panes. The three cases are (A) 430-Hz presumed Anophelesfemale also displaying an overtone and (B) 590-Hz and (C) 650-Hz presumed males of either Anopheles (probably An. arabiensis) or Culexspp.

Robust identification of the fundamental wingbeat frequency is challenging (36) and referred to as the pitch detection problem. The fundamental tone is not necessarily the strongest (3738), and glossy wings can have very high numbers of harmonics (39). To detect changes in the relative abundance of different mosquito taxa, we therefore estimated continuous modulation power spectra for each observation in 40 frequency bins, linearly spaced from 85 to 850 Hz. The average modulation spectra of the aerofauna changes over the course of the day, and Fig. 3 illustrates the reproducibly distinct frequency content of the monitored atmosphere. The modulation power spectra differed throughout the day but were highly consistent between days, including the day of the eclipse. Whereas the low-frequency part of the spectrum (<300 Hz) is, in general, crowded by the broad ensemble of active species, the higher range of frequencies (>300 Hz) associated with mosquitoes displays a significant increase during the eclipse with respect to the average and spread during ordinary daytime hours. The increase occurs although all observations are averaged in intervals (the contrast between mosquito wingbeats and overtones from other species is expected to be less in Fig. 3 than the actual case). This is contrary to the ordinary temporal niches of both Anopheles and Culex mosquitoes.

Fig. 3 Average modulation power spectra of the aerofauna during different times of the day.While the power spectra differ with the time of day, distributions for the same time intervals are consistent between consecutive days (same color graphs). During the eclipse on 1 September, the high-frequency content displays maximal power in the range associated with mosquitoes. In the evening, following the eclipse, reduced activity is displayed around 300 Hz before and after sunset. This is later compensated for during the night between 2 and 3 September. The absolute number of observations is stated in the legend in thousands per hour, and we note that the activity during the eclipse reaches a third of that of a normal evening rush hour. Approximate frequency ranges of Anopheles and Culex wingbeats are indicated underneath the power spectrum curves for both sexes. The gray arrows provide rough indications of possible frequency shifts from temperature and mosquito biomass payload. (For methodological and interpretational details, see the Supplementary Materials.) In addition, the lidar full width at half maximum (FWHM) resolution for a 23-ms time window is indicated.

Hierarchical clustering and classification

Here, we demonstrate the first remote classification of discrete mosquito groups in wild populations across a natural landscape. To classify observations into taxonomic groups and ultimately identify when important malaria vectors are most active, we sorted observations in groups according to modulation similarity with hierarchical clustering (see the Supplementary Materials). The first 20 branches of the hierarchical dendrogram were inspected and interpreted. The inspection and interpretation of each cluster were primarily based on comparing the centroid modulation spectra and the within-cluster variance to previous laboratory recordings of modulation spectra of mosquitoes (3134). The centroid spectra of the interpreted clusters are shown in Fig. 4. In several clusters, no significant wingbeat oscillations could be identified, given the extent of the within-cluster variance. Because these clusters had a larger backscatter cross section than the others, we refer to them as larger (N:21628, 9%). In the remaining centroid spectra, the fundamental wingbeat frequency was identified. In some clusters, harmonic overtones are folded at the Nyquist frequency, which can result in deformed tones or tones appearing in the wrong order. Identification of the fundamental tones was confirmed, also considering the position of multiple harmonics. Several of the clusters display fundamental tones in the range of 100 to 200 Hz, which is below mosquito wingbeats, and they are thus referred to as low-frequency insects (N:103631, 44%). Clusters displaying fundamental frequencies in the range of 340 to 440 Hz were attributed to female mosquitoes (N:20625, 9%), and clusters with fundamental frequency in the range of 560 to 810 Hz were attributed to male mosquitoes (N:35265, 15%). Other clusters were named according to temporal and spatial features. These include twilight insects at 220 Hz (N:5809, 3%) that become active just before dusk, bright large (N:26106, 11%) without wingbeat frequency, and a large cross section without associated larger apparent size (40), morning insects (N:20596, 9%) with wingbeat frequencies in the range of 250 to 290 Hz, particular morning activity and minor aggregates in the range domain. Last, we observed no presence of vertebrate predators, which can be frequent during mosquito swarming (40). All groups decrease with the range due to detection sensitivity limitations, but female mosquitoes were detected comparatively more frequently closer to the village, whereas low-frequency insects were detected further from the village and male mosquitoes displayed an intermediate range distribution. The hierarchical clustering was also carried out in various alternative ways using linear powers, logarithmic frequency bins, or weighting by frequency, whereby clusters changed in size and order. However, the conclusions following cluster reinterpretation remain consistent.

Fig. 4 The observations were grouped in 20 clusters by HCA by considering the resemblance of the modulation spectra.The top part displays the dendrogram and relative cluster assignment fractions. The bottom part displays the 20 centroid spectra and within-group variances. The power scale is the same for all subfigures. The fundamental tones and the first three overtones are indicated. The overtones are subjected to the folding artifact, see, e.g., C5, and in some cases, e.g., C6, the fundamental and second harmonics are ambiguous. We have indicated our interpretation of the cluster in each subfigure. The interpretation is based on features in the centroid spectra neglecting details within the group variance. Other particular features in range and time were used to assign some of the low-frequency clusters. The ambiguous cases C6, C14, and C15 were not included in the subsequent analysis of mosquito activity patterns.

Activity patterns

By monitoring overall insect activity over the course of the day, we identified as the main feature a spike in activity in the very late afternoon (Fig. 5A). This evening activity peak is bimodal, with a minor peak around 18:00 followed by a dominant peak at 18:50, 19 min after sunset, which is consistent with previous reports (4142). The second most dominant feature is a morning activity peak at 6:16, 17 min before sunrise. The shape of this peak is skewed toward later hours, particularly for female mosquito activity. These are likely to be females leaving the village in search of oviposition sites or sugar. Third, there is an exponential increase in insect activity during the day from 10:00 to 16:00, which is not observable on the linear scale in Fig. 5. Last, at night, the activity of small insects decreases exponentially from dusk to dawn, but this pattern is less clear for larger insects. Morning activity peaks are only 18 min long, and the main evening peak lasts for 21 min [full width at half maximum (FWHM)]. The precise and reproducible timings of activity within minutes presented here from Tanzania are consistent with previous reports from agricultural patches in China (40).

Fig. 5 Time-dependent probability of observation of various insect groups (probability of observing each group during one diurnal cycle is set to 100%, the absolute and relative abundances are given in the previous figure).Sunlight level measurements are indicated in gray in the background (the fluctuations are due to patchy clouds). (A) Normal days (average of 2 to 5 September 2016). (B) The day of the eclipse (1 September 2016). The intervals from Fig. 3 are indicated at the top of the figures. The insect grouping interpretations are based on selected classifications in Fig. 4 and denominated with identical labels. Uncertain groups are not included.

Effects of eclipse

At the study site, the annular solar eclipse on 1 September 2016 started at 10:08 and ended at 13:52, with a maximum obscuration at 11:58, blocking 96% of the sunlight. The median activity during the eclipse can be compared to the medians and interquartile spreads for the following days (Table 1). This eclipse caused significant increases in activity levels for both male and female mosquitoes. The male activity increased 87 times and females increased 7.4 times, in both cases clearly exceeding the statistical spread for ordinary days several times (Fig. 5B and Table 1). Twilight species responded strongly to the eclipse, whereas larger insects remained unresponsive. The eclipse had no apparent effect on the timing of the activity in the following evening or morning, but the activity was lower in the evening of 1 September (see Table 1). The overall frequency content in Fig. 3, reflecting species composition, does not deviate on the evening of the eclipse.Table 1Insect activity of the various groups during ordinary days compared to the day of the eclipse.

Upper values are medians, and indicated spread is an interquartile range (IQR). The environmental conditions are given in the right columns. The selected hours have equivalent light levels to the eclipse.

Embedded Image

From the nighttime light trap data (see table S1), there were no noticeable changes in the numbers of species caught. In general, the catch volumes fluctuated widely, and it may be difficult to assign the effect to the eclipse. The number of An. gambiae seemed unaffected for the evening after the eclipse of 1 September but appears to display a decrease in the evening of 3 September. The less abundantly caught mosquito species displayed peak counts for the evening after the eclipse.

Conclusion and perspective

Here, we showed that lidar technology enables the detection and classification of malaria vector mosquitoes, as well as quantification of their activities, in the wild populations in a rural African landscape. The high wingbeat frequency of mosquitoes makes them easy to distinguish against low-frequency flyers such as moths, flies, midges, and bees. Moreover, classification by sex is feasible in the field because male and female mosquitoes differ in wingbeat frequency (3133). We found that the activity is highest during well-defined morning and evening “rush hour” periods, with important implications for temporal targeting of actively delivered adulticide interventions such as space spraying (22). Moreover, we demonstrated a significant increase in activity during the eclipse for both male and female mosquitoes, which was particularly strong among males. Since the lidar could currently not elucidate possible host-seeking behavior of females, we remain unsure whether the eclipses could raise the risk of malaria infection at unexpected hours of the day. The observation that altering light levels affect flight activity in situ, however, opens up opportunities for novel laboratory experiments and the development of possible light-based control measures.

While this study successfully classifies some mosquito groups, higher sampling rates would further enable capture of the harmonic overtones of mosquitoes and avoid sampling artifacts such as beating and folding. Capturing overtones would also improve the precision of fundamental tone estimates, which might provide the precision needed to recognize blood-fed and gravid mosquitoes by their frequency shifts (43). The available power of laser diodes increases, and faster sensors enter the market every year. Recent laboratory studies have further demonstrated the determination of mosquito age (44) and infectious state (45) by infrared light. These bands could, in principle, be implemented in lidar in situ monitoring in the future. We conclude that entomological lidar is a groundbreaking new tool for estimating infection risks and identifying malaria vectors in flight over a considerable distance and entomological ecology in general, offering many new opportunities for in situ studies of wild populations in real ecosystems. Future research will provide many improvements to this technology, through developments in photonics and signal processing and integration into ecological research questions.


Field campaign

Field monitoring was carried out in the vicinity of Lupiro village during the middle of the dry season, 31 August to 5 September 2016, including during the pan-African solar eclipse. There were no precipitation or wind (<1m/s). The weather was relatively stable with patchy clouds. Light smoke from cooking and agriculture appears in the lidar signal but was removed by data processing. Temperature minima of ~19°C occurred at 6:00 and ~28°C maxima at 16:00. Tanzania is at UTC +3. At the site, during the campaign, sunrise occurred at 6:34, solar noon occurred at 12:32, sunset occurred at 18:31, and true midnight was at 0:32. On 1 September, moonrise was at 6:27, and moonset was at 18:40, with a back-lit new moon of 0% illumination at 392,965-km distance. On 4 September, moonrise was at 8:34, and moonset was at 21:00 with 9% illumination at 402, 322-km distance.

Lidar transect

The lidar transect was terminated on a black neoprene sheet at a distance of 596 m (8°22′44.93′′S, 36°40′31.39″E, 306 m ASL). The beam propagated 3 to 5 m above the ground and intersected elevated footpaths at 85-, 124-, 152-, 230-, 357-, and 450-m range. The instrument was deployed in a hut at the periphery of the Lupiro village (8°23′3.74′′S, 36°40′26.66″E, 308 m ASL), overlooking the nearby agricultural patches, and was powered by a small 2-kW motor generator throughout the experiment (see Fig. 1 and fig. S1).

Lidar configuration

A lidar with Scheimpflug configuration with infinite focal depth (2937404649) was used. This enables very high sample rates for modulation analysis of remote insect targets (313350). Monitoring was performed with an invisible near-infrared beam, transmitted from a 3-W laser diode emitting at a wavelength of 808 nm with vertical polarization. The average reflectance of aerofauna, including Anopheles, in this spectral region is around 20% (405152). The beam is expanded by a f600mm, ø120mm refractor and focused into a 2.5 cm by 23.3 cm line at the distant termination target (toothpaste-shaped beam). With the F/5 beam expander in this study, the coupling efficiency is ~25%. Total emitted intensity is estimated as 400-mW continuous wave (CW) or 457-μJ pulse energy. The intensity at the aperture is a 3.5 mW/cm2 CW or pulse energies of 4 μJ/cm2. Intensity at the termination is 17 mW/cm2 CW or pulse energies of 19 μJ/cm2. The MPE (maximum permitted energy) is 4 μJ/cm2 at 808 nm and 300 μs for pulsed consideration and 20 mW/cm2 at 808 nm and 300 μs for CW consideration. The beam is not considered eye-safe. The beam is inaccessible to the general public and has a minimum altitude over the ground of 3 m. A surveillance camera with near-infrared filter and sensitivity confirms the beam stability at the termination target at all times.

The backscattered light was collected by a f800mm, ø200mm Newtonian reflector telescope. The baseline transmitter-receiver separation was 814 mm and positioned vertically. After being transmitted through an 808-nm, 10-nm FWHM band-pass filter, light is detected with a linear 2048-pixel CMOS (complementary metal-oxide semiconductor) detector in Scheimpflug configuration at 45° tilt. The sensor was operated at a 3.5-kHz line rate, and the laser was modulated on even and odd exposures for sunlight subtraction throughout the day.

Data, calibration, and estimation of target modulation power spectrum

Raw data were saved in frames with 35,000 exposures spanning 10 s, each file with 140 Mb. The entire dataset was stored on several USB3 hard drives and was approximately 6 Tb. The dark or background exposures were identified by comparing even and odd exposures and hereafter subtracted (synchronized lock-in detection). In Fig. 2, three fractions of the raw data display three particular observations of active mosquitoes during the eclipse.

Range was calibrated on the basis of triangulation and precise knowledge of the mechanical details of the lidar system components, global position coordinates (GPS) of both lidar and termination target, and identification of termination echo pixel on the linear sensor. The range resolution deteriorates linearly by range, and the precision is roughly 3%. The main source of this uncertainty is the beamwidth. For details, see (295354).

Backscatter cross sections are derived by backscattered light intensity. It is the product between geometrical projected area and reflectance, σt = At*Rt, where reflectance is relative a Lambertian white target. This quantity was calibrated by knowledge on the reflectance of the neoprene termination target (1.8% Lambertian reflectance at 808 nm) and estimation of the beam and pixel footprint at the target. Following this fixpoint for the backscatter cross section, insect backscatter cross sections were calibrated by back extrapolation toward the lidar with the assumption of 1/r2dependence and homogeneous atmosphere. For details, see (47). The quantity shown in fig. S3 is the time median during each observation. Mosquitoes constitute the smallest classes, and the values roughly correspond to previous estimates (3351).

An alternative size measure, unrelated to reflectance, is the apparent size. This quantity is derived from the product of distance and opening angle (pixel spread). Remote small insects constitute point scatter sources, whereby the point spread function of the telescope is observed. However, larger insects at close range constitute an opening angle in conjunction with the beamwidth. This causes observations to spread over various pixels on the sensor. Since the distance is provided by the lidar, this opening angle can be converted to millimeters. For details on size estimations, see (4047).

Data were divided into static signal and rare constituents by descriptive statistics, and insect observations exceeding a signal-noise ratio (SNR) of 2 were selected. This implies that the expectation value of static atmospheric range-dependent echo is estimated by the temporal median for each 10-s frame. The static echo originates from aerosols, moisture, and molecules in the air. The static echo also reflects the range-dependent overlap function of laser intensity illuminating each pixel footprint. To filter out occasional plumes from burning in agriculture and cooking, the temporal median was calculated from the entire measurement period. In analogy with the temporal median, the instrument noise was estimated by the temporal interquartile range (IQR). The noise is predominantly readout noise but also contains dark-current background noise from sunlight and atmospheric turbulence. This noise is symmetric and displays a Gaussian distribution. This implies that the noise amplitude is 2.355 times the SD and 3.491 times the IQR. Insect observations of an SNR of >2 implies that the maximal signal strength from an insect observation exceeds two noise amplitudes for the given range; for further details, see (48). We counted 312,191 observations throughout the study period, where we define an insect observation as a connected patch of pixels and exposures in the range time frame. For the purpose of classifying insects and estimating their modulation power spectrum, we selected observations exceeding the most common beam transit time, which was 23 ms in this experiment (see fig. S2). Hereafter, 233,660 observations (75%) were considered for further analysis.

The power spectral density was estimated by Welch’s method in MATLAB (MathWorks, USA). We used a Gaussian time window of 23 ms FWHM to avoid side lobes. The window vector had 41 elements, since an odd number of elements is needed for minimum width and symmetry. Modulation power was estimated in 41 frequency bins spaced 22 Hz between bin centers. The minimal observable frequency was 44 Hz (one period during 23 ms), and the Nyquist frequency was 875 Hz (half sampling rate after background subtraction and a quarter of the sensor line rate). We identified anomalies in the outermost frequency bins, and presumably, this relates to frequency folding, bin centers, and edges of the implementation of the Welch method. The outermost frequency bins were therefore excluded from the analysis. This left us with spectral power density estimates in 39 frequency bins from 65 to 854 Hz (included in data file S1).

We synthesized a 10-s sinusoid, coinciding with the central frequency bin center, and fed it through identical Gaussian window width based on most likely transit times of 23 ms. This yielded a spectral resolution of 83 Hz FWHM comparable to the Rayleigh resolution criterion. This resolution estimate is added to Fig. 3.

Clustering and interpretation

Each modulation spectrum was normalized by a corresponding pink noise model (noise decrease with frequency). The models have the form log(Pnoise) = k0 – k1f, where Pnoise is the power spectrum of adjacent empty data and the two model coefficients, k0 and k1, are found by regression. Noise models were representative for each observation with respect to range and time of the day. The normalized modulation spectra of all observations were arranged in a 233,660 × 39 matrix, powers were logarithmized, and for the purpose of hierarchical cluster analysis (HCA), Euclidean distance was calculated for all observation pairs in a 39-dimensional space (linkage). The Euclidean distance of logarithmized values implies a logical conjunction operation, where simultaneous contents of all frequency bins in a cluster are similar. The pairwise linkage computation is demanding but was accomplished in MATLAB (MathWorks, USA) using the “ward” and “savememory” algorithm flags. We visualized the first 20 branches of the dendrogram (Fig. 4), by plotting the centroid modulation spectra for each branch, using the within-group median spectrum (Fig. 4). The spread was visualized by the within-group IQR. Details within the group spread were not interpreted. The identity of the clusters was primarily inferred by the fundamental frequency. The fundamental tones and their harmonics were subjectively added to the graphs by an experienced examiner. In some cases, the fundamental tone was ambiguous because of sample folding, and these cases were omitted from further interpretation. Mosquitoes, however, have a particularly high wingbeat frequencies, and frequencies of the possible target sex and species are available from the literature (313555). The associated mosquito species were selected among the species present in the area at the time of the experiment (see the survey in the next paragraph). The frequency ranges and FWHM were added to Fig. 3. Coarse values for temperature shifts for morning and evening temperatures were estimated (34). Coarse estimations for frequency shift due to the payload for blood-fed and gravid mosquitoes were also estimated (435556). The various shifts are indicated in Fig. 3. Apart from frequency content, some clusters in Fig. 4 were labeled according to particular behavioral features such as the daily pattern (twilight/morning insects) or range distribution (far insects), as well as signal properties such as the backscatter cross section (large) or the reflectance (bright) (see fig. S3).

Previous laboratory studies of photonic mosquito classification (3133) have reported high accuracies in the range of 80 to 95% using Naïve Bayes Classifiers (NBCs). However, these studies only include a few species, and the accuracy diminishes when similar species are included in the studies. In this in situ study, a high number of species can be expected to intercept the probe volume, and the true identity of each observation is not available. To provide some clues on the possible overlap of the clusters in Fig. 4, independent Gaussian-distributed modulation powers in the 39-dimensional space of frequency bins were assumed. Since Gaussian distributions range from minus infinity to plus infinity and since modulation powers are positive definite, then modulation power was logarithmized. This is in analogy with the logarithmic powers fed to the HCA in this study. The NBC tool in MATLAB (MathWorks, USA) was used to fit 39-dimensional Gaussian distributions to the logarithmized modulation powers of each of the 20 clusters in Fig. 4. The same toolbox was used to evaluate the confusion matrix and Gaussian overlap between the clusters (see fig. S4). The estimated accuracy and overlap of the hierarchical clusters are seen in the right pane. Many of the clusters, which were interpreted similarly, display the largest confusion; therefore, the accuracy for the grouped clusters is better than accuracies for the clusters. One observation is that NBC performs worse than random for the cluster C4 (female mosquitoes). It is concluded that clusters identified by the HCA are not necessarily Gaussian distributed, they do not necessarily have independent variables, and they are not necessarily spherical or ellipsoids in the 39-dimensional frequency space. This is in accordance with our previous understanding (37). The frequency content and overtones from a single species and sex are governed by spherical functions according to the observation angles; therefore, modulation powers across the frequency domain are neither independent nor linearly related, even for the same species and sex. Figure S4 therefore also largely reflects differences between HCA and NBC rather than actual accuracy of the clustering.

Comparisons to mosquitoes detected in homes

The indoor densities of the local malaria mosquitoes were measured throughout the course of the experiment using miniature CDC (Center for Disease Control, USA) light traps. These traps were placed indoors to provide a daily level of host-seeking mosquitoes. Three houses from the village close to the lidar equipment (within ~50 m, at the beginning of the lidar transect) were selected. The traps were placed beside occupied bed nets on the foot side. Humans sleeping inside the bed nets acted as bait for human-seeking mosquitoes. The traps were turned on at 18:00 and emptied the following mornings at 6:30. In a small laboratory, located in the Lupiro village, the collected mosquito samples were identified to species level based on morphological characters following the criteria of Gillies and Coetzee (57). Specimens identified as members of the An. gambiaecomplex (58) or An. funestus group (59) were identified to species by PCR (polymerase chain reaction). Blood meal analysis was performed by ELISA (enzyme-linked immunosorbent assay). Caught individuals were unfed, but the results are not representative for free-flying mosquitoes in general, and therefore, this analysis is not considered in this study.