Testing AI-Coded Apps – Challenges and Tips

AI tools like Lovable.dev are changing app development, enabling rapid prototyping and empowering anyone to create functional applications through natural language prompts.

These tools can generate code many times faster than a developer writing it by hand, but they also introduce unique challenges in testing, debugging, and maintaining the generated applications. When you add AI to the team, you need to be vigilant.

Let’s explore below some common challenges and scenarios you may run into, and how you can test for and identify them.

If you want to be able to use the code as a boilerplate and scale the product afterwards, don’t add 300 features before checking and testing it! AI generates hundreds of lines of code, making it harder and harder to review and maintain, so test and check the code as early as possible.

Also be aware that these tools will use whatever library they think is best, or one they have a partnership with (for example, Lovable.dev will push you to use Supabase), and some of these libraries/tools might not be the best or cheapest option for your product (check subscription prices). These AI tools might also use libraries that are deprecated, creating conflicts with other dependencies as you scale and introducing new bugs.

If you just want to test the market with a prototype, and you are completely okay with this MVP potentially being rewritten from scratch, then there is no need to worry too much.


Common Challenges in Testing AI-Coded Apps

1. Code Quality and Optimisation

Scenario: An e-commerce startup uses Lovable.dev to build a shopping platform. The generated code includes a product listing feature but contains redundant database queries that degrade performance.

Generated Code Example:

// Generated by AI
let products = [];
for (let productId of productIds) {
    let product = db.query(`SELECT * FROM products WHERE id = ${productId}`);
    products.push(product);
}

Issue: The code queries the database inside a loop, resulting in multiple queries for a single operation.

If you only had a happy-path test scenario you wouldn’t catch this one, so in this case you will need to actively check the database and its performance.
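
A possible fix is to batch the lookups into a single query. This is only a sketch: the exact placeholder syntax and `db.query` signature depend on your database driver.

// Sketch of a fix: one parameterized query instead of one per ID.
// Placeholder syntax varies by driver; adapt to yours.
let placeholders = productIds.map(() => '?').join(', ');
let products = db.query(
    `SELECT * FROM products WHERE id IN (${placeholders})`,
    productIds
);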

2. Limited Customization and Flexibility

Scenario: A nonprofit organization creates an event management app. The app's AI-generated code fails to include the functionality to calculate the carbon footprint of events.

Generated Code Example:

// Generated by AI
events.forEach(event => {
    console.log(`Event: ${event.name}`);
});

Issue: The AI didn't include a custom calculation for carbon emissions.

This is typical: sometimes AI only codes the front-end and some of the interactions between components, using hardcoded data, but it is unable to create the backend or the logic behind it unless you explicitly ask for it and send the formula. This can be caught with a simple happy-path scenario using different inputs.
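
Here is a sketch of the kind of logic you would have to request explicitly. Everything in it is illustrative: the `attendees` field and the emission factor are assumptions, not real figures or fields from the generated app.

// Sketch only: hypothetical emission factor and event.attendees field.
const EMISSIONS_PER_ATTENDEE_KG = 2.5; // illustrative value, not a real figure

events.forEach(event => {
    const footprintKg = event.attendees * EMISSIONS_PER_ATTENDEE_KG;
    console.log(`Event: ${event.name}, estimated footprint: ${footprintKg} kg CO2`);
});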

3. Debugging Complexity

Scenario: A small business generates a CRM app with an AI tool. The notification system malfunctions, sending duplicate notifications.

Generated Code Example:

// Generated by AI
reminders.forEach(reminder => {
    if (reminder.date === today) {
        sendNotification(reminder.userId, reminder.message);
        sendNotification(reminder.userId, reminder.message);
    }
});

Issue: Duplicate notification logic due to repeated function calls.

Sometimes even the AI is able to pick this one up (you know, when it suggests refactoring the code?). This one would be easy to catch when running your happy-path scenario: just check that you received the notification only once.
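
Beyond deleting the duplicated call, you can make the loop defensive. The Set-based guard below is my addition, a sketch rather than anything the tool generated:

// Sketch: remove the duplicate call and guard against repeated sends.
const alreadyNotified = new Set();

reminders.forEach(reminder => {
    const key = `${reminder.userId}:${reminder.message}`;
    if (reminder.date === today && !alreadyNotified.has(key)) {
        sendNotification(reminder.userId, reminder.message);
        alreadyNotified.add(key);
    }
});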

4. Scalability Concerns

Scenario: A social media startup builds its platform. The AI-generated code fetches user data inefficiently during logins, causing delays as the user base grows.

Generated Code Example:

// Generated by AI
let userData = {};
userIds.forEach(userId => {
    userData[userId] = db.query(`SELECT * FROM users WHERE id = ${userId}`);
});

Issue: The loop-based query structure slows down login times for large user bases.

This one tends to be identified later in the development cycle, unless you are doing performance tests early on. You will probably only catch it once you have a large database of users. It is easy to fix, but it can be fixed before it becomes a headache.
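
Even a crude timing check in a test run can surface this early. A minimal sketch, assuming a hypothetical `fetchUserData` wrapper around the generated fetch logic and an arbitrary 500 ms budget:

// Sketch: time the login fetch with a realistically sized ID list.
const start = Date.now();
fetchUserData(userIds); // hypothetical wrapper around the generated code
const elapsed = Date.now() - start;
console.assert(elapsed < 500, `Login data fetch took ${elapsed}ms, expected < 500ms`);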

5. Security Vulnerabilities

AI coding is great when the stakes aren't too high

Scenario: A healthcare startup generates a patient portal app. The AI-generated code stores sensitive data without encryption.

Generated Code Example:

// Generated by AI
db.insert(`INSERT INTO patients (name, dob, medicalRecord) VALUES ('${name}', '${dob}', '${medicalRecord}')`);

Issue: Plain text storage of sensitive information.

This is another typical issue in AI-generated apps: they usually fall short on data security. Be extra cautious when checking data transactions and how the data is being managed and stored.
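
As a sketch of safer handling, you could encrypt the sensitive field before it reaches the database and use a parameterized insert instead of string interpolation. This assumes Node.js's built-in crypto module and a `db.insert` that accepts placeholders; adapt it to your actual driver, and in practice load the key from a secrets manager.

// Sketch: encrypt sensitive data and use a parameterized query.
const crypto = require('crypto');

const key = crypto.randomBytes(32); // in practice, load from a secrets manager
const iv = crypto.randomBytes(12);

const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
const encryptedRecord = Buffer.concat([cipher.update(medicalRecord, 'utf8'), cipher.final()]);
const authTag = cipher.getAuthTag();

// Placeholders instead of string interpolation: no SQL injection.
db.insert(
    'INSERT INTO patients (name, dob, medicalRecord, iv, authTag) VALUES (?, ?, ?, ?, ?)',
    [name, dob, encryptedRecord, iv, authTag]
);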

6. Over-Reliance on AI

Scenario: A freelance entrepreneur creates a budgeting app. When a bug arises in the expense tracker, the entrepreneur struggles to debug it due to limited coding knowledge.

Generated Code Example:

// Generated by AI
let expenses = [];
expenseItems.forEach(item => {
    expenses.push(item.amount);
});
let total = expenses.reduce((sum, amount) => sum + amount, 0) * discount;

Issue: Misapplied logic causes an incorrect total calculation.

This is another one that AI can catch while developing the app. Because AI sometimes mixes back-end and front-end code, it can be hard to debug even for an experienced developer; for someone without coding skills, the challenge can be even more complex. AI can also help you find the error, and you can probably catch this one not only when deploying, but also when running your happy-path scenario.
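
A happy-path check with hand-computed inputs is enough to expose it. A minimal sketch, where `calculateTotal` is a hypothetical wrapper around the generated calculation:

// Sketch: with known inputs, the total should match a hand-computed value.
const testItems = [{ amount: 10 }, { amount: 20 }];
const total = calculateTotal(testItems); // hypothetical wrapper around the generated logic
console.assert(total === 30, `Expected 30, got ${total}`); // fails, exposing the misapplied discount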

Most AI coding platforms won't create tests for their own code unless explicitly asked. Lovable, for example, doesn't create any tests for its code. This is another thing you need to keep in mind when using these tools.

Another point: AI is not really good at keeping up with the latest technologies. For example, blockchains: it still can't do much there, but maybe it's just a matter of time? These technologies keep changing and evolving every second you breathe; AI can't keep up yet, and neither can humans 😂

Some Tips to Maintain AI-Coded Apps

  • Conduct Frequent, Comprehensive Code Reviews
  • Implement Testing Protocols
  • Train AI to use Code Best Practices
  • Plan for Scalability
  • Prioritise Security
  • Foster Developer Expertise

Human-AI Collaboration: Ministry of Testing London Meetup Recap

Last week I attended a face-to-face Ministry of Testing meetup focused on, guess what? AI vs Testers: Friend or Foe? 🤖🧪

One of the key takeaways was the recognition that AI isn’t about replacing testers, but rather about augmenting their abilities. While one or two people were concerned about job security, the consensus was that upskilling is crucial.

That’s why I always recommend that people follow emerging technologies. My first interaction with AI was 7 years ago, when I posted about machine learning in 2018, and also on an AI chatbot project that I joined just after.

Focus, learn, practice, and stay calm: you are not going to be replaced by AI, but maybe by people who use AI 🤷‍♀️


The future of testing lies in leveraging AI tools effectively, and those who adapt will thrive. The discussion highlighted core skills that will remain essential for long-term careers:

  • Clear Thinking: AI can analyse code, but human critical thinking and problem-solving are still key.
  • Passion for Quality: A genuine commitment to quality remains a uniquely human trait.
  • Adaptability: The tech landscape is constantly shifting. Embracing change and learning new technologies, like AI, is essential.

The meetup also covered the limitations of current AI models. Bias in data sets, as highlighted by the Global Data Quality Report, remains a significant concern. We discussed how even sophisticated simulations, like a “simulated CEO,” struggle to replicate human personality and decision-making.

Testing AI: Challenges and Approaches

Testing AI itself has unique challenges, primarily due to the sheer volume of data involved. Some organisations are using automation with massive datasets, but careful scoping is essential. The human element remains crucial, especially at key decision points. It’s also important to remember that AI can still be “delusional” – producing unexpected or incorrect results.

Practical Advice and Considerations:

Some practical advice:

  • Don’t follow blindly: AI is powerful, but it’s not a silver bullet. Understand the value proposition before implementing it.
  • Be aware of the limitations: AI can slow you down and requires careful planning. Define clear objectives before you start.
  • Embrace thought leadership: Explore AI’s potential for strategic growth and innovation.
  • Research and be cautious: Don’t rely on a single model. Test with different datasets and diverse groups to ensure robustness.

Data and Privacy:

A crucial point raised was data privacy. Concerns were expressed about data being stored in the cloud without proper security measures. The importance of encryption and secure data handling was emphasised, with some companies exploring blockchain technology for data storage ❤️

The meetup reinforced what I have been saying: the future of testing lies in the synergy between human intelligence and AI tools. By effectively integrating human expertise with the capabilities of AI, we can achieve higher levels of quality and efficiency in software development. It’s about “mix brain and tool” – leveraging the best of both worlds.

EuroSTAR Conference 2024 – Stockholm

Hello, hello! A bit late as usual, but I’m here to share my experience at the EuroSTAR Conference this year. My talk was scheduled for 15:15 on Thursday, June 13th. Despite my initial anxiety, I managed to not only deliver my talk but also attend other sessions and join two tutorials. Apparently, joining two tutorials was against the rules (shh 🤫)

The key highlights

Kick Ass Testing Tutorial

  • Finding basis path: Ensure effective control flow testing by identifying the basis path.
  • Draw diagram flow: Create a detailed flowchart diagram to visualize the process.
  • Flipping decisions on baseline: Adjust decisions based on the established baseline to improve accuracy.
  • Flow chart: Use flowcharts to map out the process and identify key decision points.
  • Control flow testing: Test the control flow of the application to ensure all paths are exercised.
  • Code exercise: Focus on exercising the code you wrote, not the code that wasn’t written.
  • Business path analysis with JPath: Tools like JPath may not suffice for business path analysis; use domain analysis and equivalence class partitioning instead.
  • Pairwise workflow: Employ pairwise testing to handle millions of possible tests, as it’s impossible to test everything (see the sketch after this list).
  • User behavior focus: Ask what the user does to the application, not what the application does to the user.
  • Vilfredo Pareto principle: Apply the Pareto principle, noting that 20% of transaction types happen 80% of the time, and start with transaction history analysis.
  • Pairwise tools: Use tools like Allpairs and PICT for pairwise testing, though they are quite old school. There was no mention of AI tools to help create the data, which I found a bit odd.
  • Data variation: Ensure multiple variations of data and a reasonable amount of data for thorough testing.
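
To make the pairwise point concrete, here is a small sketch of the combinatorial explosion it tames. The parameters and values below are made up for illustration:

// Sketch: why pairwise testing matters. All parameters are illustrative.
const parameters = {
    browser: ['Chrome', 'Firefox', 'Safari'],
    os: ['Windows', 'macOS', 'Linux'],
    userType: ['admin', 'regular', 'guest'],
    paymentMethod: ['card', 'paypal', 'voucher'],
};

// Exhaustive testing: 3 * 3 * 3 * 3 = 81 combinations here, and millions
// once you add realistic parameters.
const exhaustive = Object.values(parameters)
    .reduce((count, values) => count * values.length, 1);
console.log(`Exhaustive combinations: ${exhaustive}`);

// A pairwise tool like PICT or Allpairs only needs to cover every *pair*
// of values at least once: for this model that is roughly 9 rows, not 81.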


See the PDF below:

What Are You Doing Here? Become an Awesome Leader in Testing

My favorite part was discussing the things we’ve heard throughout the years in the QA and testing industry. Some of them include:

  • Automate everything: Avoid unrealistic expectations like “automate everything” and ensure thorough testing to prevent missing bugs.
  • More test cases mean better testing: Quantity over quality in test cases can result in redundant tests that don’t effectively cover critical scenarios.
  • Just test it at the end: Believing that testing can be left until the final stages of development leads to overlooked defects and rushed fixes.
  • Quality is the tester’s job: Assuming that only testers are responsible for quality undermines the collective responsibility of the entire team.
  • We can catch all bugs with testing: Expecting testing to catch every possible defect overlooks the importance of good design and development practices.

Why AI is Killing – Not Improving – the Quality of Testing

This was the big one of the entire conference, largely due to the drama that unfolded at the end of the talk 🎭

I missed the point where the title connected to the actual talk, and it was my fault for not reading the description and going just because of the title.

They compared the time it takes to build a car from ages ago to now (Ford vs Tesla) and showed that it only saved 3 minutes. I’m not sure if they did this on purpose just to prove their point, but the comparison missed the complexity and features that have been added to new cars, like the entire software and electric systems behind a Tesla that didn’t exist before. These aspects weren’t considered in their comparison.

They also presented an interesting analysis about when AI will catch up with human intelligence, as well as the gap that AI is creating between junior and senior developers. Not many people talk about this, but indeed, AI is a tool that can help us while also potentially making us lazy, similar to how calculators did; we still need to learn the basics.

Basic Coaching Techniques for Emerging Quality Coaches

  • Active listening: It involves fully concentrating, understanding, responding, and remembering what’s being said.
  • Train yourself and learn: Continuously improving active listening skills through practice and feedback helps in understanding others better.
  • Circle of control: Focus on what you can control in conversations: your responses, understanding, and actions.
  • Feedback: Provide constructive feedback that helps the person improve without making them feel punished. Talk about the behaviour, not the identity; don’t use BUT, use AND.
  • Keep questions simple: Use straightforward questions that facilitate understanding and encourage deeper thought.
  • Be present: Engage fully in the conversation, maintaining focus and showing genuine interest.
  • 11k impressions: Recognize that perspectives can vary based on personal factors like fatigue and biases.
  • Acceptance: Mind the reality gap! Put the facts on the table. Easy? No. Necessary? Yes.
  • You have the questions, but you don’t necessarily know the answers. Help them figure out how to find a solution.
  • What are your three top values? Rank them from 1 to 10. This will help you and your mentee connect.

QA Outsourcing: Triumphs, Trials, & Takeaways

Unfortunately, I couldn’t make this one as I was back in London, but I watched the video afterwards, and the main takeaways are:

  • Strategic move: Outsourcing QA can strategically optimize resources and expertise.
  • Drive success: Effective management of outsourced QA enhances product quality and market competitiveness.
  • Growth: Outsourcing allows scalability and focus on core business functions.
  • Competitive landscape: Leveraging external QA services brings agility and innovation to stay ahead in the market.

A Tester's Guide to Navigating the Wild West of Web3 Testing

Here I am again, checking the feedback. As expected, the audience was quite different from the one I usually engage with. Since this conference is a bit more corporate, I didn’t anticipate too much variation in the audience. I was also extra nervous for this one, so instead of 45 minutes, I sped up, went into the fast lane, and finished the talk in just 30 minutes. I just gave you all some extra time for coffee! 😆

As always, I needed to gauge the Web3 knowledge level of the majority, and unsurprisingly, there is still a massive gap in education about what Web3 and Blockchain are. Thus, I spent a significant portion of my talk explaining these concepts.

The feedback was quite contradictory. Some people said it was hard to follow because no background was provided, while others mentioned they didn’t know the talk would focus solely on Blockchain (which it did not). 🤷‍♀️

So, if I give more background, people complain. If I reduce the background, people will still complain. My take on that is it’s really hard to please everyone; sometimes I can’t even make my own dog happy! 😄

I still try, though. So, thanks to those who gave constructive feedback ❤️!

I’ll work on improving for the next one 🚀

More random pictures with these great speakers whom I had the pleasure to meet, the cubic challenge, and also random exotic food talks on the boat party.

Using AI to Accelerate Test Automation

Hello hello peeps 👋

I have been a bit of a workaholic lately, but all for a good cause 😊

Not sure if you know already, but I started to work on a project, The Chaincademy, helping developers (SDETs, engineers, coders, programmers, test automation engineers…), especially the junior ones coming to tech to find their first job 💻

We launched our MVP before Xmas, and we are testing it with our audience (junior developers). So, in case you want to accelerate your career (for now, only Web3) and get your first experience as a developer, sign up for our newsletter to get access 🎉

The first time I actually adventured into AI and machine learning was back in 2018, in a machine learning workshop. I had to create an iOS app where AI replaced my face with an emoji based on my expressions 😆 Really simple, but back then, AI was not as good as it is right now. (As we say in Brazil: “Na minha época isso aqui era tudo mato”, “Back in my day, this place was all woods”.)

And since the launch of ChatGPT, I have been using AI on a daily basis to speed up all my work, more Bard actually (I think it is much better than ChatGPT nowadays). So here are some tips on how I have been using it in test automation:

Test Automation

1. Test Case Generation:

  • Scenarios: Pass in user stories and acceptance criteria to generate corresponding test cases with detailed steps and expected outcomes.

    Prompt Example: Given a user tries to register with an invalid email address, describe the steps they would take and verify that the system displays an appropriate error message.
  • Edge Cases: Ask to suggest potential edge cases or corner scenarios to test, ensuring comprehensive coverage of your application’s functionality.

    Prompt Example: For the checkout process, what happens if the user's internet connection drops while entering their payment information? List potential scenarios and expected outcomes.
  • Data-Driven Testing: Generate test data sets based on specific criteria.

    Prompt Example: Generate 10 test cases for the login feature, covering cases with valid and invalid username/password combinations and different user types (admin, regular user).

2. Coding:

  • Test Script Automation: Describe the test actions and ask for a script (see the sketch after this list):

    Prompt Example: I want to test clicking the 'Submit Order' button and verifying the order confirmation page appears. Write a Cypress with javascript script for this scenario.
  • Code Completion: Get test assertions, locator identification, and handling complex interactions.

    Prompt Example: In my Cypress test, I'm trying to assert that the element contains the text 'Welcome back'. Please suggest the next line of code with assertion syntax.
  • Refactoring: Analyze your existing test scripts and suggest improvements like removing redundancy, increasing reusability, or optimizing execution time.

    Prompt Example: Analyze my Pull request for the search functionality. Can you add comments and suggest ways to improve readability, reduce redundancy, and speed up execution?
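
For reference, here is a sketch of what a response to that first prompt might look like. The route, selector, and confirmation text are hypothetical; your app’s real ones would differ:

// Sketch of a possible Cypress test; selectors and URLs are hypothetical.
describe('Checkout', () => {
    it('shows the confirmation page after submitting an order', () => {
        cy.visit('/checkout'); // hypothetical route
        cy.get('[data-testid="submit-order"]').click(); // hypothetical selector
        cy.url().should('include', '/order-confirmation');
        cy.contains('Thank you for your order').should('be.visible');
    });
});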

3. Test Planning and Management:

  • Prioritization and Risk Assessment: Provide the test case details and application knowledge, so it can help prioritize tests based on risk or impact.

    Prompt Example: Given these 20 test cases for the new feature, rank them based on potential impact, speed of delivery and risk of failure. Explain your reasoning for each.
  • Maintenance: Identify outdated or irrelevant test cases and suggest updates or new tests to maintain coverage.

    Prompt Example: The application updated its login page layout. Identify test cases needing modification and suggest relevant updates based on the new UI.

4. Environment Management:

  • Mocks: Describe the data needs for specific tests and generate mock data or API responses, reducing reliance on real environments and dependencies (see the sketch after this list). Remember you can also use contract tests (with Pact, for example), which can be generated automatically from the code.

    Prompt Example: Generate mock API responses for the payment gateway integration test, simulating successful and failed scenarios based on test case requirements
  • Environment Configuration: Configurations for different test environments based on your application and testing requirements.

    Prompt Example: Suggest configurations for a staging environment replicating the production database but with limited user access. Include details for network settings and resource allocation.
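
Staying with Cypress as an example, cy.intercept can serve those mock payment responses. The endpoint and response shapes below are hypothetical:

// Sketch: stubbing the payment gateway with cy.intercept.
// Endpoint and response bodies are hypothetical.
cy.intercept('POST', '/api/payments', {
    statusCode: 200,
    body: { status: 'succeeded', transactionId: 'test-123' },
}).as('payment');

// ...drive the checkout through the UI, then:
cy.wait('@payment');

// For the failure scenario, swap in statusCode: 402 and
// body: { status: 'failed', error: 'card_declined' } in a separate test.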

Thanks to Abel from Graph Protocol 👍 for sending over these great resources, which I have been using to learn how to better prompt for software development:

Equal Experts Geek Conference 2023

Hey guys, 4 months ago I gave a 5-minute lightning talk about how the QA role will look in the future at the Equal Experts Conference.

We went through the evolution of the role and where it is right now, then quickly talked about the trends that are coming, so you can already prepare yourself to stay up to date 🙂


In this 5-minute talk, we quickly go through the future of the Quality Assurance (QA) position and discuss the evolution of the QA role in response to emerging trends.

The QA role has come a long way from its traditional focus on manual testing and bug detection. As technology advances, QA professionals are adapting to new demands and becoming integral contributors to the software development process.

The future of the QA position will be marked by AI Tests, Tests in the Cloud, Web3 Tests, and Alerting and Monitoring, along with strong soft skills. By embracing these trends and developing the necessary skills, QA professionals will be well-equipped to drive quality and innovation in the ever-changing software development industry.

AI for Testing: Beyond Functional Automation webinar

Hello guys, I joined a webinar some months ago (15/07/2020) about AI for Testing: Beyond Functional Automation by Tariq King, which was really interesting! I know how hard it is to keep up with all the online events now, so I always try to keep the recordings of the ones that I couldn’t join and that are interesting to listen to when I have time.

So I thought about sharing it with you as well, in case you missed it. You will learn about reinforcement learning by giving scores to the right actions, and about training bots to recognize good and bad designs with examples. This allows the framework to be more robust when searching for a particular query or asserting the scenarios.


Here is the link to the recording:

Thanks, Tariq King!