• I am surprised that, at least according to all reporting I’ve seen, Apple isn’t planning to build haptic integrations between their existing Watch product and upcoming Vision Pro headset.

    It seems like such a great way to offer additional value to owners of both products.

    I have no doubt Apple could create an uncanny sensory feedback experience for hand tracking using Watch haptics alone. For example, think about the haptic feedback you get when using the Digital Crown to scroll through a list on the watch. Imagine applying that to the act of turning a virtual dial in AR.

    Ever since 2015, the trackpads in all of Apple’s laptops have been solid state—the “click” is simulated; there are no moving parts. They have arguably never felt better. In a sense, they are better than the genuine thing. More real.

    Adding physicality to the visionOS interface will ground it in reality, deepen immersion, and provide an instant level of familiarity to those using the new platform.

  • The conversation around Apple’s upcoming Worldwide Developers Conference understandably centers on their rumored XR headset. The launch of an entirely new computing platform will undoubtedly make for an exciting event, but there is a whole different set of possibilities that I haven’t seen discussed nearly as much this year. Aside from the big stuff, WWDC always brings a slate of equally impactful, but less flashy, changes. Here is my wishlist from that second category:

    • A dedicated “Passwords” app outside of Settings
      • Bonus: a refresh of Keychain Access on macOS
    • The new journaling app that is rumored for iOS 17, especially if it incorporates data I already have from years past
    • Some love for Apple Mail—I am getting jealous of Mimestream
    • Better widgets, interactive widgets, widgets on the macOS desktop
    • When I pause audio, leave the Dynamic Island’s player controls accessible for longer than a fraction of a second
    • Improve notifications on macOS
    • Add a clipboard manager to iOS and iPadOS
  • It is hard to deny that Google killed it during their I/O conference earlier this month. They were clearly kicked into action by AI developments spearheaded by Microsoft and OpenAI, and it showed.

    Well, Microsoft held their annual Build conference yesterday. How did they respond?

    With a whimper.

    Microsoft’s close partnership with OpenAI might be their smartest move in recent memory, and they are squandering it with a complete lack of coherent product vision.

    Their big announcement was Copilot for Windows—a button on the Windows 11 taskbar that opens what appears to be a Bing AI web view. Microsoft did note that Copilot will be able to “customize the settings” on your PC, although I suspect you will still get thrown into the Control Panel if you need to accomplish anything substantial.

    The only other notable announcement was that “Browsing with Bing” will soon be the default ChatGPT experience and that ChatGPT plugins will be compatible with Bing AI.

    It isn’t a secret that Bing AI and ChatGPT share the same underlying model from OpenAI. And, unlike Google with its new generative AI search experience, Microsoft didn’t put any thought into what a meaningful user experience for an AI-assisted Bing would look like.

    It is just a chat window, just like ChatGPT.

    I don’t understand why I should want to use Bing AI instead. I don’t think Microsoft knows why, either.

    So Build was boring. Maybe Satya is just happy to have made Google dance. But Google is running now. They haven’t caught up yet but the gap is quickly closing.

  • Google held their annual I/O developer conference yesterday and the theme was, unquestionably, artificial intelligence.

    Below are a few quick thoughts on what I think are the most important announcements.

    Bard & PaLM 2

    Bard is no longer behind a waitlist and the underlying model has been updated to PaLM 2.

    PaLM 2 comes in five different sizes. The smallest model, named Gecko, is optimized to run locally on mobile devices. Google didn’t specify which size is currently behind Bard.

    As with ChatGPT, “tools” are coming to Bard. Integrations with first-party Google services will be available first. Later, third-party plugins, including Instacart, Adobe Firefly, and Wolfram Alpha, will be available.

    “In the next few weeks,” Bard will become more multimodal. It will be able to output images and will accept images as prompt inputs, similar to the (not yet available) GPT-4 multimodal capabilities.

    A new, larger, and more powerful model, Gemini, is now in training and will be added to Bard at some point in the future.

    Search

    Generative AI answers are coming to Search—Google will be adding an “AI Snapshot” section above its traditional search listings.

    Ads will be shown above the AI Snapshot box, and products from Google’s Shopping Graph will be suggested as part of its answers when relevant.

    You will be able to engage in a freeform back-and-forth chat to elaborate on generated answers. This might be the closest anyone has come to the UI I suggested back in January.

    These new features will be available as a part of the Search Labs program “in the coming weeks.”

    It should be noted that this is not the “Google Inbox Strategy” I was optimistic about a few weeks ago. These features will be coming to Google proper—not some new experimental “Magi” interface. This is a bold move; time will tell if it is the correct one.

    Tailwind

    Although it is just a “prototype,” Tailwind might be one of the first true AI-native applications I’ve seen. Google describes it as an “AI-first notebook.” Out of everything announced yesterday, it is Tailwind that we know the least about. If it is done correctly, though, it could be an extremely compelling product.

    This is the type of weird, one-off, Inbox-style experiment I want to see more of.

    Sidekick

    In what is clearly a response to Microsoft’s new 365 Copilot feature, Google previewed “Sidekick.” In Google Docs and other Workspace applications, there will be an AI-powered side panel that offers suggestions based on information from all of your Workspace documents. I think Copilot is a great idea and there is no reason to think this will be any different.

    Google Assistant

    Notable in that there was absolutely no mention of it at the conference.


    I am sure I will have more to say about all of this once these features become public. For now, it is evident that Google feels significant pressure from Microsoft and does not intend to go down without a fight.

  • I recently got access to Anthropic’s Claude language model. A few thoughts on how it compares to GPT-4 and Bard:

    • Claude is really fast — easily twice as fast as GPT-4.
    • By default, Claude is often much more concise than the other language models. Claude’s answer to the question “When was the bronze age?” was only 11 words long while GPT-4’s was 135 and Bard’s was 138. (A rough harness for spot-checking speed and verbosity is sketched after this list.)
    • In my review of Bard, I noted that it is noticeably less “steerable” than GPT-4. Claude is significantly better in this regard. Maybe it is not quite as good as GPT-4 but it is certainly close.
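
    The speed and verbosity differences are easy to spot-check. Below is a minimal sketch of the kind of side-by-side harness I mean, assuming the 2023-era openai and anthropic Python SDKs (newer SDK versions use different client calls and model names, and the API keys are placeholders):

    ```python
    # A rough side-by-side check of response time and verbosity.
    # Assumes the 2023-era openai and anthropic Python SDKs; client calls
    # and model names may need adjusting for newer SDK versions.
    import time

    import anthropic
    import openai

    openai.api_key = "sk-..."  # placeholder

    QUESTION = "When was the bronze age?"

    def ask_gpt4(question: str) -> str:
        # Standard chat-completions request against GPT-4.
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user", "content": question}],
        )
        return resp["choices"][0]["message"]["content"]

    def ask_claude(question: str) -> str:
        # Claude's original completion-style API, with its explicit
        # human/assistant turn markers.
        client = anthropic.Client("sk-ant-...")  # placeholder
        resp = client.completion(
            prompt=f"{anthropic.HUMAN_PROMPT} {question}{anthropic.AI_PROMPT}",
            model="claude-v1",
            max_tokens_to_sample=300,
        )
        return resp["completion"]

    for name, ask in [("GPT-4", ask_gpt4), ("Claude", ask_claude)]:
        start = time.time()
        answer = ask(QUESTION)
        print(f"{name}: {time.time() - start:.1f}s, {len(answer.split())} words")
    ```

    Beyond those quick checks, I ran a handful of ad hoc prompts past all three models. The prompts, and each model’s answer, are below.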

    Which flag am I describing? Three horizontal stripes: blue, white, blue. Slightly left of center is a solid yellow circle. Surrounding that circle is a red letter “C.”

    Claude:

    It sounds like you are describing the flag of Chile.

    GPT-4:

    You are describing the flag of the U.S. state of Colorado. The flag consists of three horizontal stripes of equal width: the top and bottom stripes are blue, and the middle stripe is white. Slightly left of center is a solid yellow circle, which is surrounded by a red letter “C.” The design symbolizes Colorado’s natural features, such as the blue sky, white snow, and abundant sunshine, and the red “C” represents the state’s name.

    Bard:

    The flag you are describing is the flag of the state of Colorado. It was adopted on March 11, 1911. The flag is a horizontal tricolor of blue, white, and blue, with a yellow circle slightly left of center. The circle contains a red letter “C”. The colors of the flag represent the sky, the snow-capped mountains, and the Rocky Mountains. The blue circle represents the sun, and the red letter “C” represents the state of Colorado. The flag is flown at all state buildings and schools in Colorado. It is also flown at sporting events and other public gatherings.

    The answer I was looking for is Colorado. GPT-4 and Bard answered this question correctly. Claude’s “Chile” answer is totally out of left field.


    What is an antonym for the word that describes a fear of confined spaces? Your answer should be one word only.

    Claude:

    Agoraphobia

    GPT-4:

    Agoraphobia

    Bard:

    The antonym for the word that describes a fear of confined spaces is agoraphobia. Agoraphobia is a fear of open spaces or public places.

    Claustrophobia is a fear of enclosed or confined spaces.

    Here, all three models correctly answered “agoraphobia”. Bard loses points for not limiting its answer to one word, though.


    I have a perfectly square piece of paper with a red “X” drawn in the center, extending all the way across the page. What would this drawing look like if I were to fold the paper in half?

    Claude:

    If you fold a square piece of paper in half with a large red ‘X’ drawn across the center, it would look like this:

    [A red X folded in half on a square sheet of paper]

    The vertical line of the ‘X’ would be folded over on top of itself, and the horizontal line of the ‘X’ would also be folded over on itself. So you would see half of the ‘X’ on either side of the folded paper. The intersecting center of the ‘X’ would be along the folded crease of the paper.

    GPT-4:

    When you fold the perfectly square piece of paper in half, you’ll have two possible outcomes depending on the direction you fold it:

    1. Folding the paper horizontally: In this case, you will fold the paper so that the top edge aligns with the bottom edge. The red “X” will now look like two connected “V” shapes, one being the upper half of the original “X” and the other being the lower half. The connected point of the two “V” shapes will be on the fold.

    2. Folding the paper vertically: In this case, you will fold the paper so that the left edge aligns with the right edge. The red “X” will now look like two connected sideways “V” shapes (or “greater than” and “less than” symbols), one being the left half of the original “X” and the other being the right half. The connected point of the two sideways “V” shapes will be on the fold.

    In both cases, the folded paper will have a symmetrical shape of the red “X” along the fold.

    Bard:

    If you fold the paper in half along a vertical or horizontal axis, the X will appear as two identical Xs, one on top of the other. If you fold the paper in half along a diagonal axis, the X will appear as a single X that is twice as wide as it is tall.

    GPT-4’s answer here is very verbose — but bonus points for being thorough I guess? Claude’s answer was okay, but it was slightly nonsensical and the bracketed description in the middle was an interesting choice. Bard totally missed the mark here.

  • I got access to Google’s Bard language model late last week and have spent the past few days testing it out.

    Like Bing AI (and soon, ChatGPT), Bard is able to perform a web search for factual information lookup. To its credit, Bard seems to do a better job summarizing and integrating this information into its answers when compared to Bing. There is a catch, though: Bard very rarely cites its sources. This almost defeats the purpose of its web lookup capabilities altogether — if you are going to go to the trouble of aggregating outside information, I would like to be able to check your work.

    Bard is less “steerable” than ChatGPT. By that, I mean it is more difficult to direct its responses in particular ways — “limit all of your responses to only one word”, “always respond in the Socratic style”, “each word of your answer must begin with the letter W”, etc. This is the magic behind ChatGPT — it is what transformed the “glorified autocomplete” of GPT-3 into an “intelligent assistant.” OpenAI’s InstructGPT paper has more information on the approach they took toward this.
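
    To make the idea of steering concrete, here is a minimal sketch using the 2023-era openai Python SDK (the model name and API key are illustrative placeholders). The system message carries the steering instruction:

    ```python
    # A minimal sketch of steering via a system message, assuming the
    # 2023-era openai Python SDK. A steerable model obeys the constraint
    # on every turn; a less steerable one drifts back to long-form answers.
    import openai

    openai.api_key = "sk-..."  # placeholder

    resp = openai.ChatCompletion.create(
        model="gpt-4",  # or gpt-3.5-turbo
        messages=[
            # The system message is the steering instruction.
            {"role": "system", "content": "Limit all of your responses to only one word."},
            {"role": "user", "content": "What is the capital of Australia?"},
        ],
    )
    print(resp["choices"][0]["message"]["content"])  # ideally just: Canberra
    ```

    In my testing, Bard follows this kind of constraint far less reliably than ChatGPT does.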

    Overall, Bard is comparable — a little better in some ways, a little worse in others — to the original GPT-3.5 iteration of ChatGPT. If it had launched in December of last year, around the time Google issued their infamous “code red” memo — before Bing AI, the ChatGPT API, GPT-4, and ChatGPT Plugins — it would have been a serious contender. At this point, though, it feels like Google is still playing catch-up to where OpenAI was last year. That is not a great place to be.

  • Me:

    How might education change as a result of increasing artificial intelligence capabilities? In particular, large language models that are capable of passing standardized tests.

    GPT-4:

    The improvement of artificial intelligence capabilities, particularly large language models that can pass standardized tests, is likely to bring significant changes to the education landscape. Some possible ways education might change include:

    1. Personalized learning: AI-driven platforms could tailor educational content to each student’s individual needs, learning style, and pace, allowing for a more customized and efficient learning experience. This could help address the diverse needs of students and ensure that no one is left behind.

    2. Enhanced teaching resources: AI can assist teachers in creating and curating lesson plans, assessments, and other educational materials. By providing access to a vast array of resources and suggestions, AI could help teachers stay up-to-date with the latest teaching methods and content.

    3. Continuous assessment: AI systems can monitor students' progress in real-time, allowing for early identification of learning gaps and targeted interventions. This can lead to improved student outcomes and a more efficient use of instructional time.

    4. Automated grading and feedback: AI can evaluate students' work and provide instant feedback, allowing teachers to focus on other aspects of their job, such as addressing individual student needs or designing more engaging learning experiences.

    5. New teaching methods: The rise of AI might lead to innovative teaching methods and strategies that combine human expertise with machine-driven insights, resulting in more effective and engaging learning experiences.

    6. Virtual tutors and mentors: AI-powered virtual tutors can provide on-demand support to students, helping them with homework, answering questions, and offering guidance. These virtual tutors could be available 24/7, ensuring that students can access help whenever they need it.

    7. Shift in curriculum focus: As AI becomes increasingly capable of performing routine tasks, the focus of education may shift towards nurturing soft skills, creativity, critical thinking, and problem-solving abilities that are difficult for machines to replicate.

    8. Lifelong learning: AI-driven learning platforms can help individuals continuously update their skills and knowledge throughout their lives, promoting a culture of lifelong learning.

    9. Enhanced accessibility: AI technologies could help make education more accessible to students with disabilities, by offering personalized accommodations and support.

    10. Redefining the role of teachers: With AI taking on some of the more administrative and repetitive tasks in education, teachers may have more time to focus on developing strong relationships with their students and serving as mentors and guides.

    However, it is important to consider the potential challenges and ethical concerns that might arise with the increased use of AI in education, such as data privacy, potential biases in AI algorithms, and the risk of over-reliance on technology. It is crucial for stakeholders in education to engage in ongoing discussions to ensure that AI is used responsibly and effectively to enhance the learning experience for all students.

    #7 — “shift in curriculum focus” — is the type of change I would like to see in response to these technological developments. #10 — “redefining the role of teachers” — feels like a necessary extension of that.

    #6 — “virtual tutors and mentors” — seems like the predominant focus right now, judging from the many articles and product announcements in the last week alone. I think it is productive to explore this as a small aspect of education — the part where rote memorization and rigid lesson structure are unavoidable. As we move into a more technologically advanced future, an ever-increasing percentage of learning should instead center on hands-on creative problem solving, project-based learning, and social-emotional development.

    Me:

    What are a few good and a few bad ways educators could respond to these changes?

    GPT-4:

    Educators play a crucial role in navigating the changes brought about by the increasing use of AI in education. Here are a few good and bad ways educators could respond to these changes:

    Good ways:

    1. Embrace technology: Educators should be open to adopting new AI-driven tools and technologies to enhance the learning experience. This includes staying up-to-date with the latest developments, exploring different tools, and integrating them into the classroom in a thoughtful manner.

    2. Focus on soft skills and critical thinking: With AI handling more routine tasks, educators can focus on fostering creativity, critical thinking, collaboration, communication, and other soft skills that are essential for success in the 21st century.

    3. Engage in continuous professional development: Educators should seek opportunities for professional development to learn about AI technologies and how they can be effectively integrated into the curriculum. This includes attending workshops, conferences, and online courses.

    4. Collaborate with stakeholders: To ensure the responsible and ethical use of AI in education, educators should collaborate with administrators, policymakers, parents, and students to establish guidelines, share best practices, and address concerns.

    5. Advocate for equitable access: Educators should work to ensure that all students, regardless of socio-economic background, have access to AI-driven educational resources and tools.

    Bad ways:

    1. Resistance to change: A negative response to AI-driven changes would be for educators to resist or dismiss new technologies, potentially hindering student learning and limiting the benefits of AI integration in the classroom.

    2. Over-reliance on technology: While AI can be a valuable educational tool, it should not replace the role of human educators. Teachers who rely too heavily on AI-driven tools and neglect the importance of personal connections and mentorship may fail to provide a well-rounded education.

    3. Ignoring potential biases and ethical concerns: Educators who do not pay attention to the ethical implications of AI use in education, such as biases in algorithms and data privacy concerns, may inadvertently contribute to perpetuating inequalities and harm students.

    4. Neglecting digital literacy: Failing to teach students about digital literacy, including the responsible use of AI-driven tools and the potential risks involved, may leave students ill-equipped to navigate the digital landscape.

    5. Not addressing accessibility concerns: Ignoring the needs of students with disabilities or those from disadvantaged backgrounds when implementing AI-driven tools may exacerbate existing educational inequalities.

    #2 — “over-reliance on technology” — and #4 — “neglecting digital literacy” — from the “bad responses” list both bring up great points. Teachers should educate themselves and their students on the situations where computers (automation, AI, etc.) are the “right tool for the job.” When computation is the best tool for a given problem, students should know how to effectively and efficiently use it. When it is not the best tool, though, students should have the broad array of literacies and confidence necessary to approach the problem in whichever manner is best.


  • For whatever reason, I have never had the brain for mold making; any kind of intuitive understanding of the process eludes me. When to use a two-part mold, what objects are even suitable for casting, etc. Despite all of this, I periodically get the itch to try it again, which is exactly what I did this weekend.

    I ordered some Jesmonite, an interesting cross between plaster and resin that is really difficult to find in the United States despite being quite popular in the U.K., and decided to try casting a sycamore tree seed and two decorative gourds I grew last summer.

    I was completely unable to remove the sycamore seed from the silicone mold. It was probably too rough and porous. Next time I’ll try using some sort of mold release.

    The two gourds came out great, though! Afterwards, I tried painting them with watercolors, which worked much better than I expected.

  • I spent a good chunk of my day yesterday experimenting with Analog Diffusion — a Stable Diffusion model fine-tuned on film photography images. I have to say, I have been really impressed with almost everything this model has produced so far.

    I can’t find any information online about which particular categories of images were part of the fine-tuning process, so I started testing how the output would change when specifying different popular 35mm film stocks in the prompt.

    The model certainly knows that Tri-X and HP5+ are both black and white films while Portra and Ektar are color films. Less clear to me is whether there are any traits the model attributes to any single film stock in particular (for example, whether images generated from “Ektar” prompts have more saturated colors than equivalent “Portra” prompt images).
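
    For anyone who wants to run a similar sweep, here is a rough sketch using Hugging Face’s diffusers library. It assumes the wavymulder/Analog-Diffusion checkpoint and its “analog style” trigger phrase; the subject prompt and seed are my own placeholders.

    ```python
    # A sketch of the film-stock prompt sweep using Hugging Face diffusers.
    # Assumes the "wavymulder/Analog-Diffusion" checkpoint and its "analog style"
    # trigger phrase; swap in whichever fine-tuned weights you are testing.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "wavymulder/Analog-Diffusion", torch_dtype=torch.float16
    ).to("cuda")

    film_stocks = ["Kodak Tri-X 400", "Ilford HP5 Plus", "Kodak Portra 400", "Kodak Ektar 100"]

    for stock in film_stocks:
        # Re-seed per stock so the film name is the only variable that changes.
        generator = torch.Generator("cuda").manual_seed(42)
        prompt = f"analog style photograph of a quiet street at dusk, {stock}"
        image = pipe(prompt, generator=generator, num_inference_steps=30).images[0]
        image.save(stock.replace(" ", "_") + ".png")
    ```

    Holding the seed and the rest of the prompt constant makes any per-stock differences, like extra saturation from “Ektar” prompts, easier to attribute to the film name alone.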

    Finally, a few open questions I would like to investigate soon:

    • How well can the model emulate alternative photography processes like Daguerreotype, Collodion, and Tin Type?
    • What about alternative development processes? Kodak D-76 vs. caffenol vs. stand development?
    • Does specifying a high or low ISO have any perceivable effect?
    • Do particular lenses and focal lengths behave as expected? How different are Summilux 35mm images from Summarit 120mm images?
  • I teach a technology class to students from kindergarten to fifth grade. The recent, accelerated development of truly impressive AI models — especially ChatGPT and Stable Diffusion — has made it clear to me how dramatically different technological literacy will be when my students eventually enter the world as adults.

    As we move into a future with increased technological automation, forward-looking curricula across all subject matters must focus on fostering creativity in students. Although AI can make new connections between elements in its training data, it is humans alone who are capable of generating truly novel ideas.

    I believe teaching young students how to code will continue to be important to some extent. However, with the rise of code generation technologies like GitHub Copilot, the most durable programming skills might be spec-writing, debugging, and revising. Physical electronics and robotics will arguably rise in relevance for the foreseeable future. Understanding and debugging systems will be an important skill here, too.

    It would be great to hear from other educators who are thinking through similar topics right now.
