Stanislav V. Pravytskyi et al. CEUR Workshop Proceedings                                                                                              80–88


                         Designing and evaluating an affordable Arduino-based lie
                         detector prototype
                         Stanislav V. Pravytskyi, Pavlo V. Merzlykin and Alexander N. Stepanyuk
                         Kryvyi Rih State Pedagogical University, 54 Universytetskyi Ave., Kryvyi Rih, 50086, Ukraine


                                     Abstract
                                     Lie detection is an important issue in various contexts ranging from criminal investigations to hiring processes.
                                     The paper covers the prototyping and evaluation of an affordable Arduino-based lie detector that integrates
                                     physiological sensors and machine learning to detect deception. Testing on 20 questions showed the detector
                                     achieved 55% accuracy in identifying truth and 45% accuracy in identifying lies, with an overall accuracy of 50%.
                                     While further refinements are needed, this prototype demonstrates the challenges of developing an accessible
                                     lie-detection system.

                                     Keywords
                                     lie detection, Arduino, physiological sensors, machine learning, LSTM, neural networks


                         1. Introduction
                         Lie detection has been a topic of fascination and study for over a century. From early attempts [1] to
                         modern functional magnetic resonance imaging (fMRI) techniques [2], researchers have sought reliable
                         methods to discern truthful statements from deceptive ones.
                           Owing to the affordability of amateur electronics platforms, several DIY lie detector prototypes
                         have been reported recently [3, 4]. They usually use Arduino, one of the most popular amateur
                         platforms, and mostly cover data collection without sufficient processing. In this paper, we estimate
                         the accuracy and feasibility of such DIY lie detectors combined with machine learning techniques.


                         2. Literature review
                         Polygraph tests, which measure multiple physiological indicators, became the dominant lie detection
                         tool in the 20th century [1, 5]. However, these tests have been criticized for their lack of scientific
                         validity, vulnerability to countermeasures, and inadmissibility as legal evidence [6, 7, 8].
                            In recent years, researchers have turned to neuroscience tools like fMRI [2]. These studies suggest
                         that certain brain regions, such as the prefrontal cortex, are more active during lying than truth-telling.
                            Another emerging trend is the use of machine learning. Recent work [9] has applied deep learning
                         algorithms and reported 57–63% accuracy, which leaves a gap for further research.
                            Alongside scientific developments, ethical and legal debates surround lie detectors. Critics argue
                         that polygraphs and other lie detection technologies are unreliable, violate privacy rights, and may
                         be misused or overinterpreted in high-stakes contexts like criminal investigations and employment
                         decisions [10]. In the US, the Employee Polygraph Protection Act of 1988 prohibited most private
                         employers from using lie detectors. However, the technology is still widely used in government and
                         law enforcement settings [11].


                          CS&SE@SW 2024: 7th Workshop for Young Scientists in Computer Science & Software Engineering, December 27, 2024, Kryvyi
                          Rih, Ukraine
                          Email: coffitronak@gmail.com (S. V. Pravytskyi); ipmcourses@gmail.com (P. V. Merzlykin); alexanderstepanyuk@gmail.com
                          (A. N. Stepanyuk)
                          URL: https://kdpu.edu.ua/personal/pvmerzlykin.html (P. V. Merzlykin); https://kdpu.edu.ua/personal/omstepaniuk.html
                          (A. N. Stepanyuk)
                          ORCID: 0000-0002-4017-7172 (P. V. Merzlykin); 0000-0001-9088-2294 (A. N. Stepanyuk)
                                     © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


                          CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073



   While researchers continue to develop and test new methods, fundamental questions persist about
the accuracy, validity, and ethics of lie detectors. The current study aims to advance this field by
designing an Arduino-based lie detector that integrates physiological measurements with machine
learning analysis. By examining the potential and limitations of affordable lie detectors, we hope to
contribute to the ongoing dialogue.


3. Methods
3.1. Hardware
The lie detector prototype was built using an Arduino UNO development board, chosen for its affordability
and extensive community support. The following peripherals were connected to the board:

    • SHT20 temperature and humidity sensor
    • Pulse sensor to measure heart rate
    • ADS1115 16-bit ADC to increase measurement precision

  The schematic is shown in figure 1.
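
  To illustrate why an external 16-bit ADC can increase measurement precision, the following sketch compares the smallest voltage step of the two converters. It assumes a 5 V reference for the Arduino UNO's built-in 10-bit ADC and the ±6.144 V full-scale range for the ADS1115; the paper does not state which reference or gain setting was used, so both values are assumptions, and the helper name lsb_volts is hypothetical.

```python
def lsb_volts(span_volts: float, bits: int) -> float:
    """Smallest voltage step of an ideal ADC covering `span_volts` with `bits` bits."""
    return span_volts / (2 ** bits)

# Arduino UNO built-in ADC: 10 bits over an assumed 5 V reference.
uno_lsb = lsb_volts(5.0, 10)

# ADS1115: 16-bit signed output. With the assumed +/-6.144 V range,
# the 65536 codes cover a 12.288 V span.
ads_lsb = lsb_volts(2 * 6.144, 16)

print(f"UNO ADC step: {uno_lsb * 1e3:.3f} mV")
print(f"ADS1115 step: {ads_lsb * 1e6:.1f} uV")
print(f"The ADS1115 step is about {uno_lsb / ads_lsb:.0f} times finer")
```

  A narrower ADS1115 gain setting would give an even smaller step, at the cost of a smaller measurable voltage range.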

3.2. Software
Three main software components were developed:

   1. Arduino sketch to read sensor data and print it to the serial port
   2. Data collection program to save sensor readings along with truth/lie labels
   3. Machine learning model to classify data sequences as indicating truth or lies

   The Arduino code initializes the sensors and reads temperature, humidity and pulse values at regular
intervals. The data is printed to the serial port to be processed on the computer. Key sections of the
code are shown below.
   In the setup() function, which is executed on startup, we initialize all the interfaces. We use dedicated
libraries to handle both the SHT20 and the pulse sensor. A serial connection is established to transfer the
collected data to the computer.

//Libraries to handle the interfaces
#include <Wire.h>
#include "DFRobot_SHT20.h"
#include "PulseSensorPlayground.h"

DFRobot_SHT20 sht20; //SHT20 instance
PulseSensorPlayground pulseSensor; //pulse sensor instance

const int PulseWire = A0; // Pin for Pulse Sensor
const int Threshold = 525; // Threshold value for Pulse Sensor

void setup() {
  Serial.begin(9600); //establishing serial connection

  // Initialize Pulse Sensor
  pulseSensor.analogInput(PulseWire);
  pulseSensor.setThreshold(Threshold);

  // Initialize SHT20
  sht20.initSHT20();
  delay(100);
  sht20.checkSHT20();
}

Figure 1: The schematic of the Arduino lie detector.

   Unlike the setup() function, the loop() subroutine is executed repeatedly while the device is powered on.
It collects the sensor readings and sends them over the serial interface for further processing.

void loop() {
  // currentMillis is used below but is not defined in the excerpt;
  // millis() is the standard Arduino source for such a timestamp.
  unsigned long currentMillis = millis();

  //read pulse sensor data
  int myBPM = pulseSensor.getBeatsPerMinute();
  pulseSensor.sawStartOfBeat(); // Update pulse status
  int pulseValue = analogRead(PulseWire);

  // Read SHT20 data at set interval
  // (lastSHT20ReadTime, SHT20Interval, lastHumd and lastTemp are
  // globals declared elsewhere in the sketch and not shown here)
  if (currentMillis - lastSHT20ReadTime >= SHT20Interval) {
    lastSHT20ReadTime = currentMillis;
    lastHumd = sht20.readHumidity();
    lastTemp = sht20.readTemperature();
  }

  //send collected data to serial port
  Serial.print(lastTemp, 2);
  Serial.print(",");
  Serial.print(lastHumd, 1);
  Serial.print(",");
  Serial.println(myBPM);
}

    The data collection script is written in Python and consists of two subroutines. The
read_serial_data() function reads one message from the serial port and returns the parsed values.

# `ser` is an open pyserial connection created elsewhere in the script (not shown).
def read_serial_data():
    line = ser.readline().decode('utf-8', errors='ignore').strip()
    if line:
        try:
            temperature, humidity, pulse = map(float, line.split(','))
            return temperature, humidity, pulse
        except ValueError:
            return None
    return None
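
  The parsing step can also be separated from the serial port so that it can be checked without hardware. The following is a minimal sketch under that assumption; the name parse_line and the sample byte strings are hypothetical and not part of the original script. Unlike the listing above, it also rejects lines that do not contain exactly three fields before attempting the conversion.

```python
from typing import Optional, Tuple

Reading = Tuple[float, float, float]  # (temperature, humidity, pulse)

def parse_line(raw: bytes) -> Optional[Reading]:
    """Parse one 'temperature,humidity,pulse' line; return None if it is malformed."""
    text = raw.decode("utf-8", errors="ignore").strip()
    parts = text.split(",")
    if len(parts) != 3:
        return None
    try:
        temperature, humidity, pulse = (float(p) for p in parts)
    except ValueError:
        return None
    return temperature, humidity, pulse

# Hypothetical sample lines in the format printed by the Arduino sketch
print(parse_line(b"36.50,45.2,72\r\n"))  # well-formed line
print(parse_line(b"36.50,45.2\r\n"))     # truncated line
```

  With such a helper, read_serial_data() would reduce to a call like parse_line(ser.readline()).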

  The collect_data() function asks the user to label each set of collected data as truth (1), lie (0),
or test mode (3). In normal mode, 100 lines of labelled sensor data are saved to a CSV file for each
prompt. Test mode collects 2000 lines of unlabeled data for evaluating the model. The function is
shown below:

import pandas as pd

# flush_serial() is a helper defined elsewhere in the script (not shown).
def collect_data(filename):
    while True:
        data = []
        label = None
        test_mode = False

        print("Start collecting data. Press '1' for truth, '0' for lie, " +
              "'3' for test.")
        label_input = input("Enter label (1-truth, 0-lie, 3-test): ")
        if label_input in ['1', '0']:
            label = int(label_input)
            flush_serial()
            print(f"Start recording data with label {label}.")
        elif label_input == '3':
            test_mode = True
            flush_serial()
            print("Test mode active. Collecting 2000 lines of data.")
        else:
            print("Invalid input. Try again.")
            continue

        while True:
            sensor_data = read_serial_data()
            if sensor_data:
                temperature, humidity, pulse = sensor_data
                data.append([label, temperature, humidity, pulse] if not
                            test_mode else [temperature, humidity, pulse])

            if test_mode and len(data) >= 2000:
                # Note: this excerpt clears the test-mode data without saving it.
                print(f"Collected {len(data)} lines in test mode.")
                data = []
                test_mode = False
                print("Test mode complete. To continue, press '1', '0'," +
                      " or '3'.")
                break
            elif not test_mode and len(data) >= 100:
                df = pd.DataFrame(data, columns=['Label', 'Temperature',
                                                 'Humidity', 'Pulse'])
                # Append to the CSV; write the header only if the file is empty.
                with open(filename, 'a', newline='') as f:
                    df.to_csv(f, header=f.tell() == 0, index=False)
                print(f"Recorded {len(data)} lines with label {label}.")
                data = []
                label = None
                break
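
  The listing above appends each 100-line recording to the same CSV file and writes the column header only when the file is still empty. That pattern can be sketched with the standard library alone; the helper name append_rows, the temporary file, and the sample rows below are hypothetical and serve only as an illustration.

```python
import csv
import os
import tempfile

COLUMNS = ["Label", "Temperature", "Humidity", "Pulse"]

def append_rows(filename, rows):
    """Append rows to a CSV file, writing the header only if the file is empty."""
    with open(filename, "a", newline="") as f:
        writer = csv.writer(f)
        if f.tell() == 0:  # position 0 in append mode means the file has no content yet
            writer.writerow(COLUMNS)
        writer.writerows(rows)

# Demonstration on a throwaway file with made-up readings
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
append_rows(demo_path, [[1, 36.5, 45.2, 72.0]])
append_rows(demo_path, [[0, 36.6, 45.0, 80.0]])

with open(demo_path, newline="") as f:
    demo_lines = f.read().splitlines()
print(demo_lines)
```

  The header appears once even though the file is opened twice, which keeps the accumulated file readable as a single table.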

   A long short-term memory (LSTM) neural network was implemented using the Keras library. The
model architecture consists of an LSTM layer with 50 units followed by a dense output layer with
sigmoid activation. It was trained on overlapping sequences of 100 sensor readings to predict the
probability that each sequence corresponds to a truthful response (label 1); outputs of 0.5 or less
are read as a lie. An 80/20 train/validation split and early stopping based on validation accuracy
were used. The key model setup code is shown below:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.callbacks import ModelCheckpoint

# df, features, label_col, sequence_length, input_shape and the
# X_train/X_val split are defined elsewhere in the script (not shown).
def create_sequences(data, seq_length):
    sequences = []
    labels = []
    for i in range(len(data) - seq_length + 1):
        seq = data[i:i + seq_length][features].values
        # Each window takes the label of its last row
        label = data.iloc[i + seq_length - 1][label_col]
        sequences.append(seq)
        labels.append(label)
    return np.array(sequences), np.array(labels)

X, y = create_sequences(df, sequence_length)

model = Sequential()
model.add(LSTM(50, input_shape=input_shape))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Keep only the weights with the best validation accuracy
checkpoint = ModelCheckpoint('best_model.h5',
                             monitor='val_accuracy',
                             save_best_only=True, mode='max')

history = model.fit(X_train, y_train,
                    epochs=50, batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[checkpoint])
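
  The windowing step can be checked in isolation. The sketch below builds the same overlapping windows with NumPy on synthetic data and then applies an 80/20 split. The function name make_windows, the random data, and the choice of a chronological (unshuffled) split are assumptions made for illustration; the paper does not state how the split was drawn.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(values, labels, seq_length):
    """Return (X, y): every window of `seq_length` consecutive rows of `values`,
    each labelled with the label of its last row."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # sliding_window_view puts the window axis last: (n_windows, n_features, seq_length),
    # so the last two axes are swapped to get (n_windows, seq_length, n_features).
    X = sliding_window_view(values, seq_length, axis=0).transpose(0, 2, 1)
    y = labels[seq_length - 1:]
    return X, y

# Synthetic stand-in data (hypothetical): 250 rows of 3 features
rng = np.random.default_rng(0)
values = rng.normal(size=(250, 3))
labels = np.repeat([1, 0, 1], [100, 100, 50])

X, y = make_windows(values, labels, seq_length=100)
print(X.shape, y.shape)  # N rows give N - seq_length + 1 windows

# Chronological 80/20 split (an assumption): adjacent windows share most of
# their rows, so an unshuffled split keeps near-duplicates out of validation.
split = int(0.8 * len(X))
X_train, X_val = X[:split], X[split:]
y_train, y_val = y[:split], y[split:]
print(X_train.shape, X_val.shape)
```

  Because each window is labelled by its last row, a window that starts in one recording and ends in the next mixes readings from both; whether that matters depends on how the recordings are ordered in the CSV file.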

  Finally, a real-time prediction program was developed to load the trained model, collect live sensor
data from the Arduino, and output truth/lie classifications. Predictions are made on rolling windows of
the last 100 sensor values. Key sections are shown below:

import numpy as np
from keras.models import load_model

# `ser`, `sequence_length` and `features` are defined elsewhere (not shown).
model = load_model('lie_detector_model.h5')

def predict_from_buffer(buffer):
    input_data = np.array(buffer).reshape(1, sequence_length, len(features))
    prediction = model.predict(input_data)
    return 1 if prediction[0][0] > 0.5 else 0

while True:
    user_input = input(
        "Enter 'start' to begin data collection: ").strip().lower()
    if user_input == 'start':
        data_buffer = []

        # Clear serial buffer
        ser.reset_input_buffer()

        # Collect data
        while len(data_buffer) < sequence_length:
            if ser.in_waiting > 0:
                data_line = ser.readline().decode(
                    'utf-8', errors='ignore').strip()
                try:
                    temperature, humidity, pulse = map(
                        float, data_line.split(','))
                except ValueError:
                    continue  # skip malformed lines instead of crashing
                data_buffer.append([temperature, humidity, pulse])

        # When buffer is full, run test
        print("Data collected. Running test...")
        results = []
        for _ in range(100):
            result = predict_from_buffer(data_buffer)
            results.append(result)

        true_count = results.count(1)
        false_count = results.count(0)

        # Output final result
        if true_count > false_count:
            print("Truth")
        else:
            print("Lie")

   These software components work together to enable the Arduino lie detector prototype to collect
physiological data, analyze it using a trained machine learning model, and output lie/truth classifications
in real time. The system’s modular design allows for easy modification and extension of its capabilities.
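
   The rolling-window prediction described above can be sketched without the hardware or the trained model. In the sketch below, stub_predict is a hypothetical stand-in for the LSTM, and its threshold rule is purely illustrative and makes no claim about physiology. The readings are synthetic, and the helper names are assumptions rather than parts of the original program.

```python
from collections import deque

SEQUENCE_LENGTH = 100  # window size stated in the text

def stub_predict(window):
    """Hypothetical stand-in for the trained model: returns 1 ('truth') if the
    mean pulse in the window is below 90 BPM, else 0 ('lie')."""
    mean_pulse = sum(row[2] for row in window) / len(window)
    return 1 if mean_pulse < 90 else 0

def rolling_predictions(readings, predict, seq_length=SEQUENCE_LENGTH):
    """Yield one prediction per new reading once the buffer is full."""
    buffer = deque(maxlen=seq_length)  # the oldest reading drops out automatically
    for reading in readings:
        buffer.append(reading)
        if len(buffer) == seq_length:
            yield predict(list(buffer))

def majority_vote(predictions):
    """Return 'Truth' if 1s strictly outnumber 0s, else 'Lie' (same rule as the listing)."""
    predictions = list(predictions)
    ones = sum(predictions)
    return "Truth" if ones > len(predictions) - ones else "Lie"

# Synthetic readings (hypothetical): 150 calm samples, then 50 with elevated pulse
readings = [(36.5, 45.0, 70.0)] * 150 + [(36.6, 46.0, 130.0)] * 50
preds = list(rolling_predictions(readings, stub_predict))
print(len(preds), majority_vote(preds))
```

   In this variant every vote comes from a different window, so the majority reflects how the signal evolves over time rather than repeated evaluation of one fixed buffer.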




4. Results
The lie detector prototype was tested on 20 questions designed to elicit a mix of true and false responses:
   1. Is your name Stanislav?
   2. Are you 21 years old?
   3. Do you have a driver’s license?
   4. Are you standing right now?
   5. Can you play guitar?
   6. Is 2+2=5?
   7. Have you ever lied to your friends?
   8. Can you drive a car?
   9. Do you consider yourself an honest person?
  10. Did you ever get failing grades in school?
  11. Have you ever consumed alcoholic drinks?
  12. Do you like your university?
  13. Have you ever lied to your parents?
  14. Have you ever cheated at work?
  15. Do you consider yourself a kind person?
  16. Have you ever harmed other people?
  17. Are you satisfied with your appearance?
  18. Have you ever been to Kyiv?
  19. Do you consider yourself a happy person?
  20. Have you ever used someone else’s property without permission?
  The system accurately classified 55% of true statements and 45% of lies, for an overall accuracy of
50% (table 1).

Table 1
Lie detector accuracy on test questions.

Measure                                  Value
True statements correctly classified     55%
False statements correctly classified    45%
Overall classification accuracy          50%
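
  The relation between the three values in table 1 can be checked with a short calculation. Overall accuracy is the mean of the per-class accuracies weighted by the share of each class, so the reported figures imply a particular share of truthful items. The function names below are hypothetical, and only the three percentages are taken from the table.

```python
import math

def overall_accuracy(acc_truth, acc_lie, truth_share):
    """Overall accuracy as the share-weighted mean of the per-class accuracies."""
    return truth_share * acc_truth + (1.0 - truth_share) * acc_lie

def implied_truth_share(acc_truth, acc_lie, overall):
    """Share of truthful items implied by the three reported accuracies."""
    return (overall - acc_lie) / (acc_truth - acc_lie)

acc_truth, acc_lie, overall = 0.55, 0.45, 0.50  # values from table 1

share = implied_truth_share(acc_truth, acc_lie, overall)
recomputed = overall_accuracy(acc_truth, acc_lie, share)
print(f"implied share of truthful items: {share:.2f}")
print(f"recomputed overall accuracy:     {recomputed:.2f}")
print("consistent with the table:", math.isclose(recomputed, overall))
```

  The implied share is one half, meaning the reported figures correspond to equal numbers of truthful and deceptive items. On such a balanced two-class task, random guessing is also expected to reach 50%.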

  The results leave significant room for improvement in accuracy. Potential enhancements include:

    • Collecting a larger and more varied training dataset
    • Tuning the neural network architecture and hyperparameters
    • Incorporating additional physiological sensors
    • Personalizing models to each individual’s baseline physiology

   These results highlight both the promise and challenges of developing an affordable lie detection
system. With an overall accuracy of 50%, which is the level expected from random guessing on a balanced
two-class task, the current prototype performs similarly to the average human lie detector [12].


5. Discussion
This work demonstrates that an inexpensive lie detector can be constructed by interfacing physiological
sensors with an Arduino microcontroller and applying machine learning to the collected data. However,
the LSTM neural network’s accuracy (50% overall) is not yet sufficient for practical application.




   It falls short of the claims made by polygraph proponents, who often report accuracy rates of 90%
[8], although these claims are highly controversial. However, another machine learning approach is
reported to achieve 57% accuracy [9], which is comparable to our results.
   The Arduino prototype’s slight bias towards classifying statements as true (55% accuracy on truths
vs. 45% on lies) is consistent with the “truth bias” observed in human lie detection [13].
   Several limitations of the current study should be acknowledged. First, the test questions, while
designed to elicit a range of truthful and deceptive responses, may not fully capture the complexity
and motivation of real-world deception. The stakes in a laboratory setting are inherently lower than in
high-consequence contexts like criminal investigations or national security screenings.
   Second, the physiological measures used by the prototype (skin temperature, humidity, pulse) are a
subset of those typically collected by polygraphs, which also measure respiration and blood pressure.
Incorporating additional sensors could potentially improve the system’s accuracy.
   Third, the current prototype uses a single machine-learning model trained on data from multiple
individuals. Developing personalized models tailored to each individual’s baseline physiological responses
could potentially improve accuracy, as prior work has shown that accounting for individual
differences can enhance deception detection [9]. However, collecting sufficient training data from each
user to develop robust personalized models would be a significant practical challenge.
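
   One simple way to account for an individual's baseline, short of training a separate model per person, is to express each reading relative to that person's own resting values. The sketch below illustrates the idea with z-scores; it is an assumption about how personalization might be approached, not a method used in this study, and the function names and resting data are hypothetical.

```python
from statistics import mean, pstdev

def fit_baseline(rest_readings):
    """Per-channel mean and standard deviation from a resting recording."""
    channels = list(zip(*rest_readings))
    return [(mean(c), pstdev(c)) for c in channels]

def normalize(reading, baseline, eps=1e-9):
    """Express a reading as z-scores relative to the subject's own baseline.
    Channels with (near-)zero spread are only centred, not scaled."""
    return tuple((x - m) / (s if s > eps else 1.0)
                 for x, (m, s) in zip(reading, baseline))

# Hypothetical resting data for one subject: (temperature, humidity, pulse)
rest = [(36.4, 44.0, 68.0), (36.6, 46.0, 72.0), (36.5, 45.0, 70.0)]
baseline = fit_baseline(rest)

z = normalize((36.5, 45.0, 74.0), baseline)
print([round(v, 2) for v in z])
```

   A shared model trained on such normalized readings would see deviations from each person's resting state rather than absolute values, which requires only a short resting recording per user.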


6. Ethical challenges
The gaps in the results noted above bring us back to the ethical challenges surrounding lie detection. Since
physiological responses measured by lie detectors can be influenced by various factors unrelated to
deception, such as anxiety, fear, or medical conditions, it is doubtful that these devices should be used
in cases that can have serious consequences for individuals’ lives and reputations.
   Moreover, persons subjected to polygraph testing may not fully understand how the test works,
which may affect the results. Ensuring that individuals are adequately informed is an important ethical
challenge for lie detector research and practical use.
   Privacy and personal data collection are further issues that should be considered in such tests. There is
always a risk that information obtained during testing could be used against individuals in ways they
did not anticipate.
   The potential lasting impact of undergoing a lie detector test on mental health and well-being is
another concern.
   Addressing these challenges requires a careful consideration of the implications for individuals and
society, as well as a commitment to ethical principles such as validity, informed consent, privacy, and
fairness.


7. Conclusion
The development of an Arduino-based lie detector prototype demonstrates the challenges facing low-cost,
accessible DIY lie detection tools. The prototype’s performance, while not yet sufficient for practical
application, highlights the promise of this approach.
  Substantial improvements in accuracy, reliability, and generalizability will be necessary for such a
system to be viable for real-world use. This will likely require more extensive and more diverse training
datasets, more sophisticated machine learning models, and the integration of additional physiological
and behavioural measures. Personalized models that take into account individual differences may also
be a promising direction.
  At the same time, it is crucial to recognize that the challenges of lie detection are not purely technological.
Even a substantially improved lie detector would still face fundamental questions about the
nature of deception, the ethics of its application, and its appropriate role in legal, commercial, and
personal contexts.




   Lie detector studies should take into account the mentioned ethical considerations and be designed
to minimize harm, provide benefits, and respect the autonomy of participants.

Declaration on Generative AI: During the preparation of this work, the authors used Claude 3 Opus for drafting
content, text translation, generating the literature review, grammar and spelling checks, and content enhancement. After using this
service, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content.


References
 [1] G. C. Bunn, ‘Supposing that truth is a woman, what then?’: The lie detector, the love machine,
     and the logic of fantasy, History of the Human Sciences 32 (2019) 135–163.
     doi:10.1177/0952695119867022.
 [2] E. Rusconi, T. Mitchener-Nissen, Prospects of functional magnetic resonance imaging as lie
     detector, Frontiers in Human Neuroscience (2013). doi:10.3389/fnhum.2013.00594.
 [3] BuildItDR, Arduino Lie Detector, projecthub.arduino.cc,
     https://projecthub.arduino.cc/BuildItDR/arduino-lie-detector-41f703, 2022.
 [4] S. Olfat, Arduino Polygraph Machine (Lie Detector), ElectroPeak, electropeak.com,
     https://electropeak.com/learn/arduino-lie-detector-polygraph-machine/, 2016.
 [5] C. A. Ruckmick, The truth about the lie detector, Journal of Applied Psychology 22 (1938) 50–58.
     doi:10.1037/h0059742.
 [6] W. G. Iacono, D. T. Lykken, The validity of the Lie detector: Two surveys of scientific opinion,
     Journal of Applied Psychology 82 (1997) 426–433. doi:10.1037/0021-9010.82.3.426.
 [7] D. T. Lykken, Psychology and the lie detector industry, The American psychologist 29 (1974)
     725–739. doi:10.1037/h0037441.
 [8] W. G. Iacono, Psychology and the lie detector industry: A fifty-year perspective, Biological
     Psychology 190 (2024) 108808. doi:10.1016/j.biopsycho.2024.108808.
 [9] N. Rodriguez-Diaz, D. Aspandi, F. M. Sukno, X. Binefa, Machine learning-based lie detector applied
     to a novel annotated game dataset, Future Internet 14 (2022). doi:10.3390/fi14010002.
[10] J. J. Furedy, R. J. Heslegrave, Validity of the Lie Detector: A Psychophysiological Perspective,
     Criminal Justice and Behavior 15 (1988) 219–246. doi:10.1177/0093854888015002008.
[11] V. Mellema, Lie Detector Tests, in: The Encyclopedia of Civil Liberties in America: Volumes
     One-Three, volume 2, 2015, pp. 567–568. doi:10.4324/9781315699868-398.
[12] T. H. Feeley, M. J. Young, Humans as lie detectors: Some more second thoughts, Communication
     Quarterly 46 (1998) 109–126. doi:10.1080/01463379809370090.
[13] T. R. Levine, C. N. H. Street, Lie-truth judgments: Adaptive lie detector account and truth-default
     theory compared and contrasted, Communication Theory 34 (2024) 143–153.
     doi:10.1093/ct/qtae008.

